We use the halo model formalism to provide expressions for cluster abundances and bias, as
well as estimates for the correlation matrix between these observables. Off-diagonal elements due
to scatter in the mass tracer scaling with mass are included, as are observational effects such as
biases/scatter in the data, detection rates (completeness), and false detections (purity). We apply
the formalism to a hypothetical volume limited optical survey where the cluster mass tracer is chosen
to be the number of satellite galaxies assigned to a cluster. Such a survey can strongly constrain
$\sigma_8$ ($\Delta\sigma_8 \approx 0.05$), the power law index $\alpha$ where
$\langle N_{gal}|m\rangle = (m/M_1)^{\alpha}$ ($\Delta\alpha \approx 0.03$), and perhaps
even the Hubble parameter ($\Delta h \sim 0.07$). We find cluster abundances and bias are not well suited
for constraining $\Omega_m$ or the amplitude $M_1$. We also find that without bias information $\sigma_8$ and $\alpha$ are
degenerate, implying constraints on the former are strongly dependent on priors used for the latter
and vice-versa. The degeneracy stems from an intrinsic scaling relation of the halo mass function,
and hence it should be present regardless of the mass tracer used in the survey.

I. INTRODUCTION
Two of the simplest measures of how matter is distributed throughout the universe are its average density,
parametrized by the ratio of the matter density to the critical density, $\Omega_m$, and its power spectrum $P(k)$. The
amplitude of the power spectrum today is usually characterized by $\sigma_8$, the average rms mass fluctuation in spheres of
radius $8\,h^{-1}$ Mpc. Accurate determinations of both $\sigma_8$ and $\Omega_m$ are of crucial importance to cosmology as they provide some
of the simplest probes of large scale structure.

Cluster abundances are well known for their ability to constrain both $\sigma_8$ and $\Omega_m$ (see e.g. [...]
and references therein). This seems intuitively reasonable: the number of large massive objects ought to
depend on the total mass available ($\Omega_m$) and a measure of how likely it is for mass to clump at cluster scales ($\sigma_8$). More
formally, from numerical simulations ([...]) and theoretical considerations ([...]) we know to a reasonable
accuracy what the halo mass function looks like in various cosmologies. Even though mass is not directly observable,
by identifying a mass tracer and its relation to halo mass one may hope to constrain cosmology. Some examples of
mass tracers in clusters are X-ray temperatures and luminosities of the intracluster gas, optical luminosities, and the
number of galaxies found in the cluster.

The actual analysis of data may be quite involved. In particular, not only is it necessary to understand how a
mass tracer scales with halo mass, one also needs to understand both what the uncertainties in the scaling are and
the intrinsic scatter around the mean relation between the mass tracer and halo mass (see e.g. Pierpaoli et al. [40],
Viana et al. [...] for two recent, detailed analyses).

Here, we develop simple expressions for the number density and bias of clusters binned using an arbitrary mass 
tracer. While much effort has been devoted to converting observations to the theoretically simple mass function, 
here we attempt to massage the theory to fit the observations: i.e. use the halo model to make predictions for any 
experiment. We attempt to include many of the most relevant experimental effects, including intrinsic scatter in the 
mass tracer, scatter and/or bias arising from experimental measurements, and imperfect detection rates and false 
detections. We feel this is important since it allows direct comparison of theory to data: by minimizing the amount 
of data manipulation, the probability of artificially biasing the data is diminished. 

We also provide theoretical estimates of the correlation matrix between various bins. Our estimates include Poisson 
noise, sample variance, and uncertainties due to scatter in the scaling of the mass tracer with mass, the treatment of 
which we believe is new. 

Following the development of our formalism, we apply it to a model cluster catalogue that mimics the type of 
catalogues one can construct from large optical surveys such as the SDSS [...] or the 2dF [...]. For concreteness, we
take the mass tracer to be the number of member galaxies in a cluster. This has the interesting consequence that we 
can analyze both cosmological constraints and constraints on how galaxies populate halos. 

From the cosmological point of view, our results are of importance since any new measurements of $\sigma_8$ may help
narrow the large range of measured values for this quantity. More importantly perhaps, our analysis identifies
degeneracies between cosmology and the halo occupation distribution. Not only is this an interesting problem in itself
(see e.g. Zheng et al. [...], Berlind & Weinberg [...]), but the identification of degeneracies in our survey suggests that,
in general, there will be degeneracies between cosmology and the mass tracer scaling relation. Said degeneracies may
help bring into agreement seemingly conflicting results for $\sigma_8$ obtained with different assumptions of the mass tracer
scaling relation.

From the point of view of constraining galaxy formation, determining both cosmology and the halo occupation 
distribution simultaneously is important since it avoids possible systematic errors that may arise from the choice of an 
incorrect cosmology (again coming back to the question of degeneracies). Further, we believe that considering cluster 
abundances and large scale bias has the important advantage that neither halo profiles nor second order moments 
of the halo occupation distribution appear in our formulae explicitly. This makes our results very insensitive to said 
variables, thereby providing a first stepping stone toward the full determination of the halo occupation distribution. 
In particular, this type of analysis should complement halo occupation constraints from galaxy clustering (see
e.g. Jing, Mo, and Börner [...], Scoccimarro et al. [...], Moustakas and Somerville [...], Cooray [...]), as well as more
sophisticated studies of halo occupancy such as work based on the conditional luminosity function (see Yang, Mo and
van den Bosch [54] and van den Bosch, Mo, and Yang [51]).

In section 2, we develop our formalism and find expressions for the number density and bias of clusters binned 
according to measurements of an arbitrary mass tracer. In section 3 we identify the various sources of uncertainty 
intrinsic to observations, i.e. uncertainties that would exist even for a perfect experiment. In section 4, we show 
how to include various experimental effects both in the predictions for what will be observed and in the uncertainties 
associated with the data. Having finished our formalism, we present in section 5 the assumptions for our model survey 
and characteristics of the assumed cluster catalogue. These are supposed to mimic the real catalogues which one may 
expect to construct with surveys such as the SDSS and 2dF. Section 5 sets up our fiducial model and states all 
assumptions used to obtain our results, which are presented in section 6. Section 7 addresses how the results change if 
we do not have bias information and consider cluster abundances alone. We present our conclusions in section 8. Also 
included as an appendix is a more thorough discussion of what is usually called the cluster abundance normalization 
condition ($\sigma_8\Omega_m^{\gamma} \approx 0.5$, $\gamma \approx 0.5$) than the one presented in the main text.

II. HALO MODEL FORMALISM AND CLUSTER STATISTICS: 
A. The Halo Model Approach 

The Halo Model is a theoretical framework developed to understand clustering properties of different mass tracers
in the universe. The halo model does this by dividing the problem in two: first, it assumes all mass in the universe
is distributed in units called halos. The halo model then assumes that all properties of mass tracers within a halo
(e.g. galaxies, X-ray temperature, etc.) are determined exclusively by the physical properties of the parent halo (e.g.
mass, angular momentum, and so on).^1

Let then $\eta$ be our mass tracer, e.g. X-ray temperature/luminosity, optical luminosity, or number of member
galaxies. We will be interested in the clustering properties of halos as a function of the tracer $\eta$. In particular, we
will be interested in the density and bias of halos for an arbitrary binning criterion $\psi(\eta)$. For instance, one may wish
to bin clusters by specifying the maximum and minimum values $\eta$ may take for a cluster to be included in a specific
bin. This corresponds to a top-hat selection function: $\psi(\eta) = 1$ when $\eta_{max} > \eta > \eta_{min}$ and $\psi(\eta) = 0$ otherwise.
Since we will be interested in medium and large mass halos, we will be using the terms halos, groups, and clusters
interchangeably.

B. Cluster Density 

Let us write then the cluster density in the halo model formalism. Let the $i$th halo be located at position $\mathbf{x}_i$ and
let $\eta_i$ be the value of the mass tracer $\eta$ for that particular halo. Then the density of objects where $\eta$ is larger than
some specified minimum value $\eta_{min}$ is given by

$$n_{cl}(\mathbf{x}) = \sum_i \delta(\mathbf{x}-\mathbf{x}_i)\,\theta(\eta_i-\eta_{min}) \qquad (1)$$

where the sum is over all halos. Here, $\theta(x)$ is the usual step function, i.e. $\theta(x) = 1$ for $x > 0$ and $\theta(x) = 0$ otherwise,
so that a halo contributes to the density if and only if $\eta > \eta_{min}$. Alternatively, one may be interested in some other
selection criteria, e.g. looking at objects with $\eta_{max} > \eta > \eta_{min}$, characterized by a top-hat function as discussed
above. Let $\psi_a(\eta)$ represent an arbitrary window function used to bin data, where $a$ contains the parameters that
specify the binning. In this case, the above expression becomes

$$n^t_a(\mathbf{x}) = \sum_i \delta(\mathbf{x}-\mathbf{x}_i)\,\psi_a(\eta_i) \qquad (2)$$

where the superscript $t$ is meant to signify that this is the true density of objects in our bin, i.e. what we would observe
with perfect instruments. Systematic effects brought about by observations may lead to an observed density $n_a \neq n^t_a$.
We fold these into our formalism in §4.

^1 In the simplest cases, halos are taken to be spherical distributions of dark matter, with some specified density profile, typically an NFW [...]
or Moore [...] profile. For our purposes, neither the shape nor the mass distribution of the halos will be important.

We now make use of the ergodic hypothesis, and assume that the spatial clustering statistics of any field are identical
to those obtained upon averaging the corresponding expressions over a hypothetical ensemble of universes. Using $\langle\,\rangle$
to denote ensemble average, and an overbar to denote spatial averaging, we expect the spatially averaged cluster
density to be:

$$\bar n^t_a = \Big\langle \sum_i \delta(\mathbf{x}-\mathbf{x}_i)\,\psi_a(\eta_i) \Big\rangle \qquad (3)$$

In order to obtain expectation values, one needs to know then the probability distribution for $\eta_i$. We make the
standard assumption that the value of the observable $\eta$ for a halo of mass $m$ is a random variable with a probability
distribution $P(\eta|m)$ depending only on the mass $m$ of the halo. For instance, the assumption of hydrostatic equilibrium
in clusters allows one to relate the observed X-ray temperature to the mass of the cluster. Likewise, simulations seem
to indicate that the number of galaxies $N$ in a halo of mass $m$ is relatively insensitive to environment, so that $P(N)$
depends only on $m$ [...].

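Writing $\eta_i = \eta(m_i)$, where $\eta(m)$ is a random variable for each mass value $m$, we can rewrite the expression above as:

$$\bar n^t_a = \Big\langle \sum_i \delta(\mathbf{x}-\mathbf{x}_i)\,\psi_a(\eta(m_i)) \Big\rangle$$
$$= \Big\langle \int_0^\infty dm \sum_i \delta(m-m_i)\,\delta(\mathbf{x}-\mathbf{x}_i)\,\psi_a(\eta(m)) \Big\rangle$$
$$= \int_0^\infty dm\, \Big\langle \sum_i \delta(m-m_i)\,\delta(\mathbf{x}-\mathbf{x}_i) \Big\rangle\, \langle\psi_a(\eta(m))\rangle$$

Notice that it is only because we are assuming that $\eta$ is independent of cosmology that we can take $\psi_a(\eta(m))$ out of the
ensemble average above, and consider its average value over the halo occupation distribution alone.^2 We now define
the quantity

$$n(m,\mathbf{x}) = \sum_i \delta(m-m_i)\,\delta(\mathbf{x}-\mathbf{x}_i), \qquad (4)$$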

i 

which represents the true halo density field, the spatial average of which is called the halo mass function $\bar n(m)$. We
obtain then that the number density of groups in a bin $a$ is

$$\bar n^t_a = \int dm\, \langle n(m,\mathbf{x})\rangle\, \langle\psi_a(\eta(m))\rangle = \int dm\, \bar n(m)\, \langle\psi_a|m\rangle, \qquad (5)$$

where $\langle\psi_a|m\rangle$ is meant to be the average value of $\psi_a(\eta(m))$ over the probability distribution $P(\eta|m)$.
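As an illustration of Eq. (5), the following is a minimal numerical sketch of the bin density for top-hat bins. Everything specific in it is an assumption made for the example and is not taken from the text: the toy mass function `mass_function`, the mean relation `mean_tracer`, and the lognormal form of $P(\eta|m)$ with width `sigma_ln`.

```python
import math
import numpy as np

def mass_function(m):
    """Toy dn/dm: power law with an exponential cutoff (illustrative only)."""
    m_star = 1.0e14
    return 1.0e-19 * (m / m_star) ** -1.9 * np.exp(-m / m_star)

def mean_tracer(m):
    """Hypothetical mean relation eta(m) = (m / 1e13)^0.9."""
    return (m / 1.0e13) ** 0.9

def bin_probability(m, eta_min, eta_max, sigma_ln=0.3):
    """<psi_a|m> for a top-hat bin, assuming lognormal scatter in eta at fixed m."""
    mu = np.log(mean_tracer(m))
    erf = np.vectorize(math.erf)

    def cdf(eta):
        return 0.5 * (1.0 + erf((math.log(eta) - mu) / (math.sqrt(2.0) * sigma_ln)))

    return cdf(eta_max) - cdf(eta_min)

def integrate(y, x):
    """Trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def bin_density(eta_min, eta_max, m_grid):
    """Eq. (5): integral over halo mass of n(m) <psi_a|m>."""
    integrand = mass_function(m_grid) * bin_probability(m_grid, eta_min, eta_max)
    return integrate(integrand, m_grid)

m_grid = np.logspace(12.0, 16.0, 2000)
edges = [5.0, 10.0, 20.0, 50.0]
densities = [bin_density(lo, hi, m_grid) for lo, hi in zip(edges[:-1], edges[1:])]
```

Because the integral is linear in $\langle\psi_a|m\rangle$, the densities of two adjacent top-hat bins add up to the density of the merged bin, which is a convenient sanity check on any implementation.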



^2 This is a crucial and strong assumption. In particular, any dependence of the chosen mass tracer on the large scale environment would
bias our results.

C. The Cluster-Cluster Correlation Function and Power Spectrum

Following an argument similar to the one above we may obtain an expression for the cluster-cluster correlation
function $\xi^t_{a,a'}(r)$ between objects in bins $a$ and $a'$ (here $r = |\mathbf{x}-\mathbf{x}'|$):

$$1 + \xi^t_{a,a'}(r) = \int dm\,dm'\, \langle\psi_a|m\rangle \langle\psi_{a'}|m'\rangle\, \frac{\langle n(m,\mathbf{x})\,n(m',\mathbf{x}')\rangle}{\bar n^t_a\,\bar n^t_{a'}}$$
$$= \int dm\,dm'\, \langle\psi_a|m\rangle \langle\psi_{a'}|m'\rangle\, \frac{\bar n(m)\,\bar n(m')}{\bar n^t_a\,\bar n^t_{a'}}\, \left(1 + \xi_{hh}(r|m,m')\right),$$

so that, using Eq. (5),

$$\xi^t_{a,a'}(r) = \int dm\,dm'\, \xi_{hh}(r|m,m')\, \langle\psi_a|m\rangle \langle\psi_{a'}|m'\rangle\, \frac{\bar n(m)\,\bar n(m')}{\bar n^t_a\,\bar n^t_{a'}},$$

where $\xi_{hh}(r|m,m')$ is the halo correlation function between halos of different masses $m$ and $m'$ separated by distance
$r$. The superscript $t$ on $\xi^t_{a,a'}$ serves here again to remind us that this is the true cluster correlation function. The
observed correlation function may differ from $\xi^t_{a,a'}$ due to systematic effects in the observations, to be included later.
From our discussion concerning the average cluster density, we already know how to handle $\langle\psi_a|m\rangle$ and $\bar n(m)$. The
only term in the above expression which is new is $\xi_{hh}$.

The halo model provides a prescription to relate $\delta_h$, the halo overdensity, to the matter overdensity. This same
prescription relates the halo correlation function to the linear correlation function. To first order in perturbation
theory, one obtains (see e.g. [...]) $\delta_h = b(m)\,\delta_m$ where

$$b(m) = 1 - \frac{\partial \ln \bar n(m)}{\partial \delta_{sc}} = 1 + \frac{q\nu - 1}{\delta_{sc}} + \frac{2p/\delta_{sc}}{1 + (q\nu)^p} \qquad (6)$$

is the linear bias [...]. Here $\delta_{sc}$ is the critical overdensity needed for collapse, $\delta_{sc} = 1.686$; $q = 0.75$ and $p = 0.3$ come
from fitting to N-body simulations; and $\nu = \delta_{sc}^2/\sigma_{lin}^2(m)$, with $\sigma_{lin}(m)$ being the top-hat filtered rms fluctuation in the
linear density field. The radius $R$ of the top-hat filter used in defining $\sigma_{lin}(m)$ is obtained by demanding that a sphere
of radius $R$ encompasses a total mass $m$, i.e. $\Omega_m \rho_c\, 4\pi R^3/3 = m$, with $\rho_c$ the critical density of the universe.
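The dependence of the linear bias on halo rarity can be sketched numerically. The closed form used below is the Sheth-Tormen-type fitting formula with the parameters quoted in the text ($q = 0.75$, $p = 0.3$, $\delta_{sc} = 1.686$); treating it as the explicit form of the bias is an assumption of this example, and the function name `linear_bias` is hypothetical.

```python
DELTA_SC = 1.686   # critical overdensity for collapse (value quoted in the text)
Q, P = 0.75, 0.3   # fit parameters quoted in the text

def linear_bias(sigma_m):
    """Linear halo bias as a function of sigma_lin(m).

    Assumes the Sheth-Tormen-type form
        b = 1 + (q*nu - 1)/delta_sc + (2p/delta_sc) / (1 + (q*nu)^p),
    with nu = delta_sc^2 / sigma_lin^2(m).
    """
    nu = DELTA_SC ** 2 / sigma_m ** 2
    return 1.0 + (Q * nu - 1.0) / DELTA_SC + (2.0 * P / DELTA_SC) / (1.0 + (Q * nu) ** P)

# Rarer halos (smaller sigma, larger nu) are more strongly biased.
bias_rare = linear_bias(0.5)
bias_common = linear_bias(1.0)
```

In this form, halos with $q\nu = 1$ have $b = 1 + p/\delta_{sc}$, and the bias grows roughly linearly with $\nu$ for rare, massive halos.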

In this limit, the halo-halo correlation function becomes $\xi_{hh}(r|m,m') = b(m)\,b(m')\,\xi_{lin}(r)$ and thus the cluster-cluster
correlation function simplifies to:

$$\xi^t_{a,a'}(r) = \bar b_a\,\bar b_{a'}\,\xi_{lin}(r) \qquad (7)$$

where^3

$$\bar b_a = \frac{1}{\bar n^t_a} \int dm\, \bar n(m)\, b(m)\, \langle\psi_a|m\rangle. \qquad (8)$$

Not surprisingly, we see that the cluster-cluster correlation function traces the underlying mass correlation function.
It is worth pointing out that observationally the power spectra of galaxies and clusters are seen to have the same
shape but different amplitude on scales $k^{-1} \gtrsim 10$ Mpc or larger, so that the simple scale-independent linear bias is
indeed feasible on large scales (in fact, it is necessarily so for Gaussian fields [4]).



III. INTRINSIC ERRORS ESTIMATE 



We discuss now three intrinsic uncertainties associated with our observables: Poisson errors, sample variance, and
uncertainties due to intrinsic scatter in the mass-tracer to mass scaling relation. Keeping an eye on our choice of
model survey in §V, we assume a volume limited sample such that all objects of interest (i.e. all halos with $\eta$ in the
range of interest) are detected. Further, we assume no contamination of the sample, and a perfect instrument so that
the observed value $\eta^{obs}$ of the mass tracer always matches the true value $\eta^{true}$. Experimental bias and scatter are
treated in §IV.



^3 We have chosen to denote this quantity $\bar b$ since the expression depends on the average halo mass function $\bar n(m)$ rather than $n(m,\mathbf{x})$.

A. Poisson Uncertainties and Sample Variance

The first type of uncertainty is the Poisson error in the number of clusters found. Assuming no intrinsic clustering,
the variance in the density of clusters is simply given by $\bar n^t_a/V$. This contribution to the correlation matrix is therefore

$$\tilde C^{nn}_{a,a'}\big|_{Poisson} = \big\langle (n^t_a - \bar n^t_a)(n^t_{a'} - \bar n^t_{a'}) \big\rangle_{Poisson} = \delta_{a,a'}\, \frac{\bar n^t_a}{V}. \qquad (9)$$

Note the use of the symbol $\tilde C_{a,a'}$ for the correlation matrix. We include a tilde above the $C$ to identify it as the "intrinsic"
correlation matrix, meaning that no observational effects have been taken into account. There is also a Poisson term
in the bias arising from the estimation of the clustering properties with a finite number of galaxies (namely the second
term in equation 34), which is treated later on to properly account for contamination and completeness.

In addition to this, Hu and Kravtsov [22] showed that sample variance becomes increasingly important as we probe
lower and lower mass scales in the halo mass function. We rederive here the result for sample variance found in
Hu and Kravtsov, to use it as a reference for deriving the sample variance errors involving bias.

Assuming we have averaged over $P(\eta|m)$ (uncertainties associated with this probability are derived below), the
sample variance contribution to the density-density correlation matrix is given by

$$\tilde C^{nn}_{a,a'}\big|_{sample} = \int d^3x\, d^3x'\, W(\mathbf{x})\, W(\mathbf{x}') \int dm\, dm'\, \langle\psi_a|m\rangle \langle\psi_{a'}|m'\rangle\, \big\{ \langle n(m,\mathbf{x})\, n(m',\mathbf{x}')\rangle - \bar n(m)\,\bar n(m') \big\}$$

where $W(\mathbf{x})$ is the survey's window function. Note we do not include terms proportional to simultaneous deviations
from the mean due to sample variance and the variance of $\psi_a$ due to $P(\eta|m)$ as these would yield only small corrections.

We approximate $W(\mathbf{x})$ above to be a spherical top-hat function encompassing a volume equal to that of the
survey volume. The matrix element can now be easily computed by replacing $n(m,\mathbf{x})$ in the above expressions with
$\bar n(m) + \delta n(m,\mathbf{x})$ where $\delta n = b(m)\,\bar n(m)\,\delta$. Since $\langle\delta\rangle = 0$ and $\langle\delta(\mathbf{x})\,\delta(\mathbf{x}')\rangle = \xi(\mathbf{x}-\mathbf{x}')$, we obtain then

$$\tilde C^{nn}_{a,a'}\big|_{sample} = \bar n^t_a\, \bar n^t_{a'}\, \bar b_a\, \bar b_{a'}\, \sigma^2(R_V) \qquad (10)$$

where $R_V$ is given by $4\pi R_V^3/3 = V$. This is the final result that we were looking for, namely an expression for the
covariance matrix between the number of objects found in each bin due to sample variance. The total density-density
correlation matrix is obtained then by adding the sample variance matrix and the Poisson noise.
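The sum of the Poisson and sample variance terms can be assembled in a few lines. The sketch below assumes the bin densities, mean biases, survey volume, and $\sigma^2(R_V)$ are already known; the numerical values are placeholders chosen for the example, not results from the text.

```python
import numpy as np

def density_covariance(n_bar, b_bar, volume, sigma2_rv):
    """Intrinsic density-density covariance: Poisson term plus sample variance.

    Poisson:         delta_{a,a'} * n_a / V
    Sample variance: n_a n_a' b_a b_a' sigma^2(R_V)
    """
    n_bar = np.asarray(n_bar, dtype=float)
    b_bar = np.asarray(b_bar, dtype=float)
    poisson = np.diag(n_bar / volume)
    weighted = n_bar * b_bar
    sample = np.outer(weighted, weighted) * sigma2_rv
    return poisson + sample

# Placeholder inputs: two bins, densities in Mpc^-3, volume in Mpc^3.
cov = density_covariance(n_bar=[1.0e-4, 1.0e-5], b_bar=[1.5, 2.5],
                         volume=1.0e8, sigma2_rv=1.0e-4)
```

The Poisson term is diagonal and shrinks with survey volume, while the sample variance term couples all bins through the common large scale density fluctuation.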

Let us now turn towards matrix elements involving bias. As before, the sample variance contribution is 

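Again, we replace $n(m,\mathbf{x})$ by $\bar n(m) + \delta n(m,\mathbf{x})$ and $b(m,\mathbf{x}')$ by $b(m) + \delta b(m,\mathbf{x}')$. To get an expression for $\delta b$, we
generalize Eq. (6) to unbarred $b$ and $n$, so that

$$b + \delta b = 1 - \frac{\partial}{\partial\delta_{sc}} \ln\big(\bar n\,(1 + b\,\delta)\big) = b - \frac{\partial b}{\partial\delta_{sc}}\,\delta \qquad (11)$$

where the second equality holds to leading order in $\delta$. Hence $\delta b = -(\partial b/\partial\delta_{sc})\,\delta$.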

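With this result, the sample variance contribution to $\tilde C^{nb}_{a,a'}$ may be written as

$$\tilde C^{nb}_{a,a'}\big|_{sample} = \bar n^t_a\, \bar b_a\, I_{a'}\, \sigma^2(R_V) \qquad (14)$$

where

$$I_a = \int dm\, \frac{\bar n(m)}{\bar n^t_a}\, \langle\psi_a|m\rangle \left( b(m)^2 - \frac{\partial b}{\partial\delta_{sc}} \right). \qquad (15)$$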

In exactly the same way we obtain the bias-bias terms of the correlation matrix, which are given by

$$\tilde C^{bb}_{a,a'}\big|_{sample} = I_a\, I_{a'}\, \sigma^2(R_V). \qquad (16)$$

B. Mass Tracer Dispersion Errors

By mass tracer dispersion errors we mean the uncertainties in the density and bias of objects in a given bin $a$ due to
the fact that $\eta$ is not uniquely determined by $m$, i.e. uncertainties due to the probability distribution $P(\eta|m)$. Note
these are somewhat analogous to Poisson uncertainties in the number of galaxies in a spatial bin. Not surprisingly,
then, the uncertainties take on a form $(\delta n)^2 \sim n/V$ where $V$ is the volume of the survey.

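Consider then a realization of the halo model. Given the survey volume $V$, the number density of objects in bin $a$
in one realization may be written as

$$n^t_a = \frac{1}{V} \sum_i \nu_i\, \psi_a(\eta(m_i))$$

where $\nu_i$ is the number of halos in the mass bin $m_i$. The mass binning must be chosen small enough so that at most
one halo is found in any given mass bin. This ensures $\eta_i$ is a random variable for each halo (i.e. that $\eta_i$ and $\eta_j$ are
uncorrelated for any two halos $i$, $j$). We have then

$$\langle n^t_a\, n^t_{a'} \rangle = \frac{1}{V^2} \Big[ \sum_i \sum_{j\neq i} \nu_i \nu_j\, \langle\psi_a(\eta(m_i))\rangle \langle\psi_{a'}(\eta(m_j))\rangle + \sum_i \nu_i\, \langle\psi_a(\eta(m_i))\, \psi_{a'}(\eta(m_i))\rangle \Big] \qquad (17)$$

We have used above that $\eta(m_i)$ and $\eta(m_j)$ are not correlated, and that $\nu_i^2 = \nu_i$ (since $\nu_i = 0, 1$). We may write a
similar expression for $\langle n^t_a\rangle \langle n^t_{a'}\rangle$,

$$\langle n^t_a\rangle \langle n^t_{a'}\rangle = \frac{1}{V^2} \Big[ \sum_i \sum_{j\neq i} \nu_i \nu_j\, \langle\psi_a|m_i\rangle \langle\psi_{a'}|m_j\rangle + \sum_i \nu_i\, \langle\psi_a|m_i\rangle \langle\psi_{a'}|m_i\rangle \Big] \qquad (18)$$

so upon subtracting Eq. (18) from Eq. (17), we find

$$\tilde C^{nn}_{a,a'}\big|_{mass\ tracer} = \frac{1}{V^2} \sum_i \nu_i\, C^{\psi}_{a,a'}(m_i)$$

where

$$C^{\psi}_{a,a'}(m) = \langle\psi_a(\eta(m))\, \psi_{a'}(\eta(m))\rangle - \langle\psi_a|m\rangle \langle\psi_{a'}|m\rangle. \qquad (19)$$

But note that $\nu_i/V$ is just the average number density of halos in a bin of mass $m_i$. Averaging over many realizations,
we have

$$\frac{\langle\nu_i\rangle}{V} = \bar n(m_i)\, \Delta m_i$$

and hence

$$\tilde C^{nn}_{a,a'}\big|_{mass\ tracer} = \frac{1}{V} \sum_i \Delta m_i\, \bar n(m_i)\, C^{\psi}_{a,a'}(m_i).$$

In the continuum limit, we obtain our final answer,

$$\tilde C^{nn}_{a,a'}\big|_{mass\ tracer} = \frac{1}{V} \int dm\, \bar n(m)\, C^{\psi}_{a,a'}(m) \qquad (20)$$

where $C^{\psi}_{a,a'}$ is given by equation (19). If the binning function satisfies $\psi_a(\eta)\,\psi_{a'}(\eta) = \delta_{a,a'}\,\psi_a(\eta)$ (e.g. non-overlapping
top-hat bins), equation (19) simplifies to

$$C^{\psi}_{a,a'}(m) = \delta_{a,a'}\, \langle\psi_a|m\rangle - \langle\psi_a|m\rangle \langle\psi_{a'}|m\rangle. \qquad (21)$$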

Our expression for the contribution to the correlation matrix from $P(\eta|m)$ makes sense. If $P(\eta|m)$ is very narrow,
then a given mass $m$ will always get assigned to one and only one bin $a$. In that case, $\langle\psi_a|m\rangle = 1$, and Eq. (21) shows
that the correlation matrix vanishes. That is, if the mass tracer is perfect, then there is no uncertainty associated
with it. In the more realistic case, a mass $m$ will sometimes be assigned to more than one bin. Then, the second
term in Eq. (21) would be non-zero even if $a \neq a'$. This would lead then to an anti-correlation between these two bins.
Because of the leakage into adjoining bins, $\langle\psi_a|m\rangle < 1$ and the diagonal term in $C^{\psi}_{a,a'}$ also becomes nonzero.
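The behaviour just described can be checked directly for non-overlapping top-hat bins, where the kernel at fixed mass reduces to $\delta_{a,a'}\langle\psi_a|m\rangle - \langle\psi_a|m\rangle\langle\psi_{a'}|m\rangle$. The following sketch uses hypothetical function names and made-up bin probabilities; the mass integral uses a simple trapezoidal rule.

```python
import numpy as np

def tracer_kernel(bin_probs):
    """Kernel at fixed mass for non-overlapping top-hat bins:
    delta_{a,a'} p_a - p_a p_a', where p_a = <psi_a|m>."""
    p = np.asarray(bin_probs, dtype=float)
    return np.diag(p) - np.outer(p, p)

def mass_tracer_covariance(m_grid, n_of_m, bin_probs_of_m, volume):
    """(1/V) * integral over m of n(m) times the kernel, via the trapezoidal rule.

    bin_probs_of_m has shape (len(m_grid), n_bins).
    """
    kernels = np.array([tracer_kernel(p) for p in bin_probs_of_m])
    integrand = np.asarray(n_of_m, dtype=float)[:, None, None] * kernels
    dm = np.diff(m_grid)[:, None, None]
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * dm, axis=0) / volume

perfect = tracer_kernel([0.0, 1.0, 0.0])   # every halo lands in a single bin
leaky = tracer_kernel([0.2, 0.7, 0.1])     # scatter leaks into neighbouring bins
```

For the perfect tracer the kernel vanishes identically; for the leaky one the diagonal entry of the central bin is 0.7 × 0.3 = 0.21 and its off-diagonal entries are negative, reproducing the anti-correlation between bins.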
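What about density-bias terms? Going back to equation (8), we can write the bias in one realization as

$$b^t_a = \frac{1}{\bar n^t_a V} \sum_i \nu_i\, b(m_i)\, \psi_a(\eta(m_i)).$$

A procedure analogous to the one before yields

$$\tilde C^{nb}_{a,a'}\big|_{mass\ tracer} = \frac{1}{\bar n^t_{a'} V^2} \sum_i \nu_i\, b(m_i)\, C^{\psi}_{a,a'}(m_i).$$

Converting this into an integral, we get

$$\tilde C^{nb}_{a,a'}\big|_{mass\ tracer} = \frac{1}{\bar n^t_{a'} V} \int dm\, \bar n(m)\, b(m)\, C^{\psi}_{a,a'}(m). \qquad (22)$$

Incidentally, note that the correlation matrix is still symmetric despite the appearance of a factor $1/\bar n^t_{a'}$. The matrix
element across the diagonal is $\tilde C^{bn}_{a',a}$, which also contains the same $1/\bar n^t_{a'}$ factor. All other terms in the expression are
clearly symmetric.

Finally, performing the same analysis yields the bias-bias contribution, given by

$$\tilde C^{bb}_{a,a'}\big|_{mass\ tracer} = \frac{1}{\bar n^t_a\, \bar n^t_{a'}\, V} \int dm\, \bar n(m)\, b(m)^2\, C^{\psi}_{a,a'}(m). \qquad (23)$$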



IV. OBSERVATION RELATED UNCERTAINTIES AND SYSTEMATICS 



We have derived above expressions for the density and bias of halos binned according to an unspecified mass tracer $\eta$,
along with the related intrinsic uncertainties assuming a volume limited survey. We have, however, been assuming no
contamination, a 100% detection rate (completeness), and the ability to observe $\eta$ precisely. We wish to incorporate into
our formalism uncertainties and systematic effects arising from observation. In particular, we assume observational
effects may be characterized by the following information:

• The average detection rate (completeness) $\bar r_a$ and its variance $C^{r}_{a,a'}$. $r_a$ is defined as the fraction of objects in
bin $a$ which are identified.

• The average false detection rate $\bar f_a$ and its variance $C^{f}_{a,a'}$. $f_a$ is defined as the number of spurious objects per
unit volume in bin $a$.

• The probability $q(\eta|\eta^t)$ that a halo with true value $\eta^t$ is observed to have value $\eta$.

Note we made the simplifying assumption that both $r_a$ and $f_a$ are position independent, though it is straightforward
(albeit cumbersome) to extend the formalism to include position dependent detection and contamination rates.

We begin by incorporating detection and contamination rates into the formalism. Further, since the true observable
is not the density of objects but rather the number of objects in a given bin, we modify our formalism appropriately.
Given our assumptions, the number of objects identified in a volume limited survey takes the form

$$\bar N_a = (\bar r_a\, \bar n^t_a + \bar f_a)\, V \qquad (24)$$

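where $\bar N_a$ is the average number of objects in bin $a$. The corresponding correlation matrix is then^4

$$C^{NN}_{a,a'} = \bar r_a\, \bar r_{a'}\, V^2\, \tilde C^{nn}_{a,a'} + \bar n^t_a\, \bar n^t_{a'}\, V^2\, C^{r}_{a,a'} + V^2\, C^{f}_{a,a'} + \bar N_a\, \bar N_{a'}\, \frac{(\Delta V)^2}{V^2} \qquad (25)$$

^4 Throughout, we are using the fact that, to leading order,

$$\langle \Delta(a_i b_i c_i)\, \Delta(a_j b_j c_j) \rangle \approx \bar b_i \bar c_i \bar b_j \bar c_j\, C^{a}_{ij} + \bar a_i \bar c_i \bar a_j \bar c_j\, C^{b}_{ij} + \bar a_i \bar b_i \bar a_j \bar b_j\, C^{c}_{ij}.$$

Terms involving products of correlation matrices are being ignored as second order terms. For the expression above, $a_i$, $b_i$, and $c_i$ are
all uncorrelated with each other and possess different probability distributions (which is the case of interest here).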

where $(\Delta V)^2$ is the variance in the volume arising from uncertainties in the measured redshifts of the clusters. The
result is just what one would expect: the uncertainties for a perfect algorithm $\tilde C^{nn}_{a,a'}$ are scaled by the detection factors.
In addition, one adds the uncertainties due to imperfect knowledge of the detection and false detection rates. Finally
one needs to include the contribution from uncertainties in the sampled volume arising from redshift uncertainties.
The expression above simplifies when the detection and contamination rates of different bins are uncorrelated, in
which case one obtains

$$C^{NN}_{a,a'} = \delta_{a,a'} \left[ (\bar N_a - \bar f_a V)^2\, \frac{(\Delta r_a)^2}{\bar r_a^2} + (V \Delta f_a)^2 \right] + \bar r_a\, \bar r_{a'}\, V^2\, \tilde C^{nn}_{a,a'} + \bar N_a\, \bar N_{a'}\, \frac{(\Delta V)^2}{V^2}. \qquad (26)$$

We wish to incorporate now systematic effects and/or uncertainties in the assigned cluster richness arising from the
observations. In particular, we wish to include the fact that binning of data is done in terms of the observed value of
$\eta$ rather than the true value $\eta^t$. We view the observations as providing a random mapping $\eta(\eta^t)$ where a value $\eta$ has
a probability $q(\eta|\eta^t)$ of occurring. The probability $q(\eta|\eta^t)$ is assumed to be known from experimental calibrations,
e.g. if we consider X-ray temperature measurements for a cluster, $q(T|T^t)$ describes how a series of measurements $T_i$
of a cluster with temperature $T^t$ is distributed.

Consider then the number of objects in a bin $a$. All we need to do then is replace $\psi_a(\eta^t)$ by $\psi_a(\eta)$ where $\eta$ is
the observed value. The relevant probability for computing $\langle\psi_a|m\rangle = \langle\psi_a(\eta)|m\rangle$ is not $P(\eta^t|m)$ but $P(\eta|m)$, the
probability of observing a value $\eta$ given a halo mass $m$. This is given by

$$P(\eta|m) = \sum_{\eta^t} q(\eta|\eta^t)\, P(\eta^t|m). \qquad (27)$$

By replacing $\eta^t$ by $\eta$ and $P(\eta^t|m)$ by $P(\eta|m)$ in our formulae, we obtain expressions for the number density and bias
of halos binned according to the observed value $\eta$. Any systematic effects and/or uncertainties introduced by the
observations will automatically be taken into account in the convolution of $q(\eta|\eta^t)$ and $P(\eta^t|m)$. Note though that
an application of our formalism requires an understanding of the probability $q(\eta|\eta^t)$, presumably characterized in the
experiment's calibration.
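For a discrete tracer such as a galaxy count, the convolution of the measurement kernel with the intrinsic distribution is a matrix-vector product. The sketch below is illustrative only: the kernel built by `toy_measurement_kernel` (80% chance of recovering the true value, 10% of scattering one step up or down, with scatter beyond the first or last value folded back into that value) and the intrinsic distribution `p_true` are assumptions made up for the example.

```python
import numpy as np

def toy_measurement_kernel(n_values, stay=0.8, step=0.1):
    """q[i, j] = probability of observing value i given true value j.

    Hypothetical kernel: each column sums to one, so probability is conserved.
    """
    q = np.zeros((n_values, n_values))
    for j in range(n_values):
        q[j, j] += stay
        q[min(j + 1, n_values - 1), j] += step
        q[max(j - 1, 0), j] += step
    return q

def observed_distribution(q, p_true):
    """P(eta|m) = sum over eta_true of q(eta|eta_true) P(eta_true|m)."""
    return q @ np.asarray(p_true, dtype=float)

p_true = [0.1, 0.2, 0.4, 0.2, 0.1]        # made-up intrinsic distribution at fixed mass
q = toy_measurement_kernel(len(p_true))
p_obs = observed_distribution(q, p_true)
```

Because each column of `q` is normalized, the observed distribution remains normalized; the measurement scatter simply broadens it relative to the intrinsic one.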

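Let us now turn our attention to bias. The density of detected groups in bin $a$, which we shall denote $n_a$ (note the
superscript $t$ is missing), is given by

$$n_a(\mathbf{x}) = r_a\, n^t_a(\mathbf{x}) + f_a$$

where we are assuming a detection rate $r_a$ and a false detection rate $f_a$, known from calibration to have values $\bar r_a$ and
$\bar f_a$ and variances $C^{r}_{a,a'}$ and $C^{f}_{a,a'}$.^5 With these assumptions, the observed correlation function is given by

$$\bar n_a\, \bar n_{a'}\, \xi_{a,a'} = \bar r_a\, \bar r_{a'}\, \bar n^t_a\, \bar n^t_{a'}\, \xi^t_{a,a'}. \qquad (28)$$

Replacing $\xi^t_{a,a'} = \bar b_a\, \bar b_{a'}\, \xi_{lin}$ and dividing through by $\bar n_a \bar n_{a'}$ we obtain

$$\xi_{a,a'} = b_a\, b_{a'}\, \xi_{lin} \qquad (29)$$

where

$$b_a = \frac{N_a - f_a V}{N_a}\, \bar b_a. \qquad (30)$$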



^5 Note we are making use of our simplifying assumption that $r_a$ and $f_a$ do not depend on position. Otherwise, there would be systematic
corrections to equation (28) due to the correlation functions of $r_a$ and $f_a$. These effects are easily incorporated, but we choose not to
in the spirit of simplicity.

The expectation value of the observed bias is therefore simply

$$\langle b_a \rangle = \frac{\bar N_a - \bar f_a V}{\bar N_a}\, \bar b_a. \qquad (31)$$

The observed bias is thus expected to be lower than the true bias due to the dilution of true clusters with false
detections. The clustering signal, however, does not depend on $r_a$. This is not surprising, as $r_a$ changes only how
many clusters we observe, but not their clustering properties. Of course, a lower $\bar r_a$ will lead to noisier estimates
of the bias (see equation 34).
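The dilution of the bias by false detections can be illustrated with a short calculation. The sketch assumes the expected counts take the form $(\bar r_a \bar n^t_a + \bar f_a)V$ and that the observed bias is the true bias scaled by the fraction of detections that are real clusters; the function names and input numbers are placeholders for the example.

```python
def expected_counts(r, n_true, f, volume):
    """Expected number of detections in a bin: (r * n_true + f) * V."""
    return (r * n_true + f) * volume

def observed_bias(counts, f, volume, b_true):
    """True bias scaled by the fraction of detections that are real clusters."""
    return (counts - f * volume) / counts * b_true

# Placeholder inputs: 90% completeness, one spurious object per 1e6 Mpc^3.
counts = expected_counts(r=0.9, n_true=1.0e-5, f=1.0e-6, volume=1.0e8)
b_obs = observed_bias(counts, f=1.0e-6, volume=1.0e8, b_true=2.0)
```

With these inputs roughly 1000 detections are expected, of which about 100 are spurious, so a true bias of 2.0 is diluted to about 1.8.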

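Now that we have our expression for bias, we may compute the corresponding uncertainties. Using equation (24)
and the expression for the observed bias above, we obtain

$$C^{Nb}_{a,a'} = \left( \frac{\bar N_{a'} - \bar f_{a'} V}{\bar N_{a'}} \right) \left( \bar r_a V\, \tilde C^{nb}_{a,a'} \right). \qquad (32)$$

We can understand the expression above qualitatively. The factor $\bar r_a V$ in front of $\tilde C^{nb}_{a,a'}$ scales the correlation matrix
to the total number of real clusters as opposed to number density. The prefactor corresponds to the scaling from the
true bias to the observed bias.

Finally, this same type of analysis gives us the bias-bias terms of the observable's correlation matrix, resulting in

$$C^{bb}_{a,a'} = \left( \frac{\bar N_a - \bar f_a V}{\bar N_a} \right) \left( \frac{\bar N_{a'} - \bar f_{a'} V}{\bar N_{a'}} \right) \tilde C^{bb}_{a,a'}. \qquad (33)$$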

In addition to the errors above, there is a contribution coming from actually attempting to estimate bias from data.
For instance, assuming the bias is measured using the power spectrum involves Fourier space discretization. This, in
turn, brings its own set of sample variance errors, as well as Poisson errors in the number of objects found in a Fourier
pixel. The corresponding errors in the power spectrum estimation scale as (see e.g. [...]):

$$\left( \frac{\Delta b_a}{b_a} \right)^2 \propto \frac{1}{V V_k} \left( 1 + \frac{1}{\bar n_a\, (b_a)^2\, P(k)} \right)^2 \qquad (34)$$

where $V$ is the volume of the survey, and $V_k$ is the volume of the corresponding $k$-shell in Fourier space, i.e.
$V_k \approx 4\pi k^2 \Delta k$ where $\Delta k$ is the minimum spacing between $k$ modes. Note the error estimate above, $\Delta b_a(k)$, depends
on the wavenumber $k$ used to estimate $b_a$. One may average the estimated $b(k)$ over a large $k$ range to obtain smaller
errors. This diagonal contribution to the bias-bias correlation matrix must be added to that in equation (16) since it
represents uncertainties in the experimental estimation of the bias parameter.

V. APPLICATION: GROUP AND CLUSTER STATISTICS IN A VOLUME LIMITED GALAXY SURVEY

We now apply our formalism to a hypothetical cluster catalogue obtained from a large volume limited galaxy survey. 
We are interested in particular in what kind of information we can extract from such a catalogue. We present below 
our assumptions as to how the hypothetical catalogue is built, followed by how the halo model formalism is applied 
in this particular case. We also present the fiducial model used in the next section to derive the type of constraints
one can expect for this type of survey.

A. Assumptions on the Cluster Catalogue 

Given a galaxy sample, one may attempt to identify groups and clusters of galaxies within it. While the notions of
clusters and cluster richness are intuitive, ultimately one needs precise definitions to obtain a well defined sample. This
task is achieved via automated cluster finding algorithms. At present, there exist a large number of said algorithms,
e.g. maxBCG [...], Hybrid Match Filter (HMF) [28], Cut and Enhance method (CE) [20], Voronoi Tessellation Technique
(VTT) [...], C4 algorithm ([38], [37]), and Friends-of-Friends (FoF) [...], all of which simultaneously identify clusters
(based on a specific set of criteria) and assign a richness measure to the identified clusters. The richness measures
are typically the number of galaxies assigned to the cluster, or some measure of the cluster's total optical luminosity.
If we identify clusters with massive halos, the richness measure provides then an observable which serves as a mass
indicator. We can thus use our formalism to describe statistical properties of cluster catalogues by using the richness
as the mass tracer $\eta$.

FIG. 1: The assumed detection and false detection rates are shown here with the solid and dashed line respectively. Note the latter is
defined here in terms of the percentage of identified clusters that are false detections.



In this paper, we assume a cluster finding algorithm having a specified set of criteria to determine when a galaxy
belongs to (is assigned to) a cluster. The number of galaxies $N$ assigned to a cluster will be our richness measure.
Examples of this type of algorithm are maxBCG, C4, Friends-of-Friends, Voronoi Tessellation Technique^6, and the cut
and enhance method. For our hypothetical catalogue, we will not worry about what the exact criteria for cluster
membership are for a galaxy: we simply need to assume such criteria exist.

Within this framework, then, the mass tracer is the number of galaxies $N^t$ in a halo while the observed richness
measure is $N$, the number of galaxies assigned to a cluster by the cluster finding algorithm. $q(N|N^t)$ is the probability
that the cluster finding algorithm will assign $N$ galaxies to a halo containing $N^t$ galaxies. Likewise, $r_a$ and $f_a$ are the
detection and false identification rates for the cluster finding algorithm.

We make the further assumption that $r_a$ and $f_a$ are uncorrelated between various bins and with each other.

It is worth pointing out that at least some cluster finding algorithms (e.g. match filter algorithms) have a detection rate that depends on the galaxy background (see e.g. Kim et al. [23]). We will ignore this effect here. To include it, one could imagine multiplying the detection rate by a factor $(1 + \gamma\delta_g)$, where $\delta_g$ is the galaxy density contrast and $\gamma$ is a constant that measures how strongly the detection rate depends on the background density. Using $\delta_g \approx \delta$ (since galaxies are unbiased tracers of mass), we could insert the above expression in equation 22 and rederive the corresponding uncertainties as in the previous section. This would add terms proportional to $\gamma$, which may be neglected in the limit that $\gamma$ goes to zero. We do not expect this effect to have major consequences for our results.

Let us then specify the characteristics of our hypothetical cluster finding algorithm. We will assume that the non-detection rate (i.e. one minus the detection rate), the false detection rate, and their errors are all power laws in the number of galaxies in the cluster. Thus, e.g., the detection rate is taken to have the form

$$\text{detection rate} = 1 - (N/N_0)^{-\gamma} \qquad (35)$$

where $N_0$ and $\gamma$ are constants. The normalization is specified by indicating the corresponding values for clusters with 5 galaxies and clusters with 50 galaxies, shown below, as well as the corresponding values of $N_0$ and $\gamma$. Figure 1 plots the corresponding rates.
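
The two-point normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the sample rates used here (30% of clusters missed at N = 5, 5% missed at N = 50) are placeholders rather than the values adopted in the paper.

```python
import math

def fit_power_law(n1, u1, n2, u2):
    """Solve u(N) = (N / N0) ** (-gamma) through two points (n1, u1), (n2, u2).

    u is a non-detection (or false detection) rate at richness N.
    """
    gamma = math.log(u1 / u2) / math.log(n2 / n1)
    n0 = n1 * u1 ** (1.0 / gamma)
    return n0, gamma

def detection_rate(n, n0, gamma):
    """Detection rate of the form 1 - (N / N0) ** (-gamma), as in equation (35)."""
    return 1.0 - (n / n0) ** (-gamma)

# Hypothetical normalization (placeholder numbers, not the paper's values):
# 30% of clusters missed at N = 5 and 5% missed at N = 50.
N0, GAMMA = fit_power_law(5.0, 0.30, 50.0, 0.05)
```

Because the rate is a pure power law, two anchor points fix both constants; the same helper can be reused for the false detection rate and for the errors on the rates.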



^ The Voronoi Tessellation Technique as given in Kim et al. [28] assigns richness using the match filter method. Nevertheless, one could imagine using the number of galaxies $N$ that the VTT technique assigns to the cluster as a richness measure.

^ We note though that the membership criterion will in general affect the expected mass-richness relation; e.g. the number of member galaxies will clearly depend on the radius used to determine membership.

* Note we are not considering $q$ to give rise to false detections or imperfect detection rates. If a cluster is identified, but its richness is mislabelled, that effect is encoded in $q(N|N^*)$. On the other hand, a cluster that is broken up into two smaller clusters, or merged with another cluster to produce a larger one, cannot be considered as a mislabelled cluster. In particular, these last two effects would greatly alter the richness and ruin the one-to-one and onto nature of the mapping between $N$ and $N^*$ that we have been assuming. The inclusion of detection rates and false detections here serves to naively account for these effects.
Note that the false detection rate corresponds to the percentage of detected clusters in the fiducial model which are false detections. Thus, e.g., we are assuming that 30% of all detected clusters with 5 member galaxies are false detections. The correct values for the different types of algorithms vary, but we believe that the numbers above should provide a fair picture of the capabilities of cluster finding algorithms at low redshifts. Details on particular algorithms may be found in the references.

Also note that when we apply our formalism we will need to assume that the data are binned into various richness classes, i.e. into bins of clusters containing $N$ galaxies where $N_{\max} > N > N_{\min}$. We will use the same detection rate for all objects in a given bin, the detection rate being defined as the average of the rates for clusters with richness $N_{\min}$ and $N_{\max}$.

Finally, we need to specify the probability that the algorithm assigns $N$ galaxies to a cluster given that its parent halo has $N^*$ galaxies (i.e. what we had called $q(\eta|\eta^*)$ earlier). It is difficult to find expressions for this probability in the literature. Here, we assume that the number of galaxies assigned to a halo takes the form $N = N^* + \delta N$, where $\delta N$ is a random variable with an exponential distribution.^ In other words, we take $q(N|N^*) = P(\delta N)$ with $P$ given by

$$P(\delta N|N^*) = A \exp(-a(N^*)\,|\delta N|). \qquad (36)$$

The parameters $A$, $a$ in the above expression are determined by the condition that the probabilities add to one, and by requiring that the expectation value of $|\delta N|$ be 10% of $N^*$. Note this distribution is wider than a Gaussian.
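
The two conditions that fix $A$ and $a$ can be made concrete with a short Python sketch. It rests on an assumption not stated in the text: $\delta N$ is taken to run over all integers, which ignores the cutoff at $N \geq 0$. Under that assumption the sums are geometric, giving $A = \tanh(a/2)$ and $\langle|\delta N|\rangle = 1/\sinh(a)$. The function names are hypothetical.

```python
import math

def exp_scatter_params(n_star, c=0.10):
    """Return (A, a) for P(dN | N*) = A * exp(-a * |dN|).

    Assumes dN runs over all integers (an illustrative simplification), so
      sum_k exp(-a |k|) = coth(a / 2)   ->  A = tanh(a / 2)
      <|dN|>            = 1 / sinh(a)   ->  a = asinh(1 / (c * N*))
    """
    a = math.asinh(1.0 / (c * n_star))
    return math.tanh(0.5 * a), a

def scatter_pmf(delta_n, n_star, c=0.10):
    """Probability of a richness error delta_n for a halo with n_star galaxies."""
    amp, a = exp_scatter_params(n_star, c)
    return amp * math.exp(-a * abs(delta_n))
```

The parameter `c` plays the role of the 10% fractional scatter; making it an argument is convenient if one later wants to vary or marginalize over the calibration of the scatter.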

It is often the case that cluster finding algorithms systematically underestimate the number of galaxies in a cluster. This effect is easily accommodated by correcting our expression for $N$ to $N = f(N^*) + \delta N$, where $f(N^*)$ is the average number of galaxies assigned to clusters with $N^*$ galaxies. Since the only effect this brings about is a rescaling of the richness axis, we do not expect our conclusions to change due to possible biases in the galaxy assignments of cluster finding algorithms (provided, of course, that they are appropriately calibrated).

As a closing note, we would like to emphasize that all results presented here depend on the ability to accurately calibrate cluster finding algorithms. In particular, recall that the parameter $a(N^*)$ in equation 36 is determined by demanding that the expectation value of $|\delta N|$ satisfy $\langle|\delta N|\rangle = cN^*$, where the value $c = 10\%$ was arbitrarily chosen. In general, marginalization over the parameter $c$ as determined from calibrations will also be necessary, leading to a degradation of the confidence regions presented here.



B. The Halo Occupation Distribution 

In order to apply our formalism, we need an expression for $P(N^*|m)$, the probability for a halo of mass $m$ to have $N^*$ galaxies in it. This probability is known as the Halo Occupation Distribution, or HOD. In accordance with the results of Kravtsov et al. [31], we assume all halos have a central galaxy, while any other galaxies found in the halo (referred to as satellite galaxies) are Poisson distributed with an average number

$$\langle N_{\rm sat}|m\rangle = (m/M_1)^{\alpha}. \qquad (37)$$

Here $M_1$ is the normalization parameter. It represents the mass of halos in which one expects, on average, to find 1 satellite galaxy. Prior galaxy formation simulations in which the distinction between host and satellite galaxies was not made agree with the results from Kravtsov et al. in that the number of galaxies in halos with large occupancy numbers is Poisson distributed with the average number increasing as a power law. Furthermore, the power law assumption for $\langle N|m\rangle$ has been used to model the galaxy correlation function for both the 2dF survey and the SDSS survey with very good agreement (see e.g. Magliocchetti and Porciani for the 2dF and Zehavi et al. for the SDSS). One may also wonder whether there is evidence that the probability $P(N^*)$ for a halo does indeed depend exclusively on its mass. Once again, simulations seem to indicate that this is the case.



There is no reason to choose the distribution we used other than that it is simple. We expect, however, that this distribution is at least qualitatively correct.

Again, the average value of $|\delta N|$ is arbitrarily chosen but we expect it to be representative.
C. Fiducial Model

1. Cosmology 

The cosmological parameters used in our fiducial model are:

$\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$, $\Omega_b = 0.049$,
$\sigma_8 = 0.85$, $h = 0.7$, $n = 1$.

Here, $n$ is the slope of the primordial power spectrum, which is filtered by the transfer function formulae from Hu and Eisenstein [21].

We also need to specify the halo mass function $n(m)$ for the chosen cosmology. There are different prescriptions for obtaining $n(m)$, the most well known being that of Press and Schechter [41]. Two other and more accurate mass functions are widely used in the literature, namely that of Sheth and Tormen and that of Jenkins. We use the Sheth-Tormen halo mass function since it can be physically motivated using elliptical collapse. The Sheth-Tormen halo mass function is given by:

$$n(m) = \frac{\bar\rho}{m}\,\frac{d\nu}{dm}\,f(\nu) \qquad (38)$$

where

$$f(\nu) = \frac{A}{\nu}\left[1 + (q\nu)^{-p}\right]\left(\frac{q\nu}{2\pi}\right)^{1/2}\exp\left(-\frac{q\nu}{2}\right). \qquad (39)$$

The value $A = 0.3222$ is obtained from numerical fits to N-body simulations, and recall from §II that $p = 0.3$, $q = 0.75$, and $\nu = \delta_c^2/\sigma^2(m)$.
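
As a quick consistency check on the quoted constants, the following Python sketch evaluates the Sheth-Tormen multiplicity function. It assumes the standard form $\nu f(\nu) = A\,[1 + (q\nu)^{-p}]\,(q\nu/2\pi)^{1/2}\exp(-q\nu/2)$ with $\nu = \delta_c^2/\sigma^2$, and the normalization $\int f(\nu)\,d\nu = 1$, which gives $A = [1 + 2^{-p}\Gamma(1/2 - p)/\sqrt{\pi}]^{-1}$. The function names are hypothetical.

```python
import math

def st_amplitude(p=0.3):
    """Amplitude A that normalizes the integral of f(nu) over nu to one."""
    return 1.0 / (1.0 + 2.0 ** (-p) * math.gamma(0.5 - p) / math.sqrt(math.pi))

def st_f(nu, p=0.3, q=0.75):
    """Sheth-Tormen f(nu), assuming nu = delta_c**2 / sigma(m)**2."""
    x = q * nu
    return (st_amplitude(p) / nu) * (1.0 + x ** (-p)) \
        * math.sqrt(x / (2.0 * math.pi)) * math.exp(-0.5 * x)
```

For $p = 0.3$ the normalization condition reproduces the amplitude $A \approx 0.3222$ quoted in the text, which suggests the quoted constants are mutually consistent under this assumed form.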



2. HOD 



We choose a halo occupation distribution in accord with our previous discussion, i.e. a host galaxy plus Poisson distributed satellite galaxies for all halos. The average number of satellite galaxies is given by equation (37) with

$M_1 = 6.0 \times 10^{12} M_\odot$, $\alpha = 1.0$.
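
The fiducial HOD can be written down directly as a probability distribution. The Python sketch below is a minimal illustration (hypothetical function names): every halo hosts one central galaxy, and the number of satellites is Poisson distributed with mean $(m/M_1)^{\alpha}$, so $P(N^*|m)$ is a Poisson distribution in $N^* - 1$.

```python
import math

M1_FIDUCIAL = 6.0e12     # solar masses, fiducial value from the text
ALPHA_FIDUCIAL = 1.0     # fiducial slope from the text

def mean_satellites(m, m1=M1_FIDUCIAL, alpha=ALPHA_FIDUCIAL):
    """Mean number of satellites, (m / M1) ** alpha, as in equation (37)."""
    return (m / m1) ** alpha

def hod_pmf(n_star, m, m1=M1_FIDUCIAL, alpha=ALPHA_FIDUCIAL):
    """P(N* | m): one central galaxy plus Poisson distributed satellites."""
    if n_star < 1:
        return 0.0                      # every halo has a central galaxy
    lam = mean_satellites(m, m1, alpha)
    k = n_star - 1                      # number of satellites
    # Poisson probability evaluated in log space to avoid overflow
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
```

For example, a halo of $6 \times 10^{13} M_\odot$ has on average 10 satellites, hence 11 galaxies in total in this model.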

The choice of the HOD parameters requires a little discussion. Both of the HOD parameters $\alpha$ and $M_1$ for large halo masses have been obtained empirically by doing halo model fits to the galaxy-galaxy correlation function as measured by the SDSS (see Zehavi et al.) and the 2dF survey (see Magliocchetti and Porciani). Zehavi et al. find a slope of $\alpha \approx 0.89$, while Magliocchetti and Porciani find $\alpha \approx 0.9$ for old galaxies and $\alpha \approx 0.6$ for star forming galaxies. Finally, Berlind et al. find $\alpha \approx 0.9$ on the basis of numerical simulations. None of these studies, however, distinguished between a central galaxy and satellite galaxies. Kravtsov et al. showed that making this distinction gives better fits in their N-body simulations, while raising the slope from 0.9 to 1.0 [31]. We have therefore opted to use a slope of 1 as the fiducial model in accord with dark matter simulations.^ Note that there is evidence that the slope varies with galaxy type (early vs. late), so clearly the correct value of the slope will depend on the exact sample of galaxies we are looking at.

The mass parameter $M_1$ likewise depends on the particular galaxy sample under consideration. Most importantly, it depends critically on the intrinsic luminosity cutoff of the sample. For galaxies with absolute magnitude $M < -21$, Zehavi et al. find the value $M_1 \approx 1.0 \cdot 10^{14} M_\odot$.^ The magnitude limit above corresponds to a galaxy density $\sim 10^{-3} h^3$ gal/Mpc$^3$. Kravtsov et al. find a similar value for $M_1$ at said density. For a density $n = 2.79 \cdot 10^{-2} h^3$ Mpc$^{-3}$, corresponding to galaxies brighter than $\approx -18$ (Blanton et al.), Kravtsov et al. obtain $M_1 = 5 \cdot 10^{12} M_\odot$ [31]. Using semi-analytical and SPH simulations, Berlind et al. find $M_1 \approx 7 \cdot 10^{12} M_\odot$ for a comparable galaxy density.^ We assume here a value $M_1 = 6 \cdot 10^{12} M_\odot$ for the fiducial model.



We note here that baryon cooling may lower the value of $\alpha$. For instance, in massive halos, the cooling time may approach the Hubble time, which may reduce galaxy formation efficiency. However, since we are not aware of either observational or galaxy formation simulation constraints on the halo occupation distribution where a distinction between host and satellite galaxies is made, we have opted to keep the value of $\alpha$ obtained from dark matter simulations.

The value of $M_1$ they quote is lower, but we have corrected it to take into account the assumption of a central galaxy. Zehavi et al. quote that at a mass of $\approx 4.5 \cdot 10^{14} M_\odot$ one expects 5.4 galaxies, corresponding to 4.4 satellite galaxies, or $M_1 \approx 1.0 \cdot 10^{14} M_\odot$ if $\alpha = 1$.
Note what we are calling $M_1$ here corresponds to $M_{\rm crit}$ in the Berlind et al. paper, i.e. the amplitude of $\langle N|m\rangle$ at high masses.
FIG. 2: We show above the predicted richness function (solid line) and observed bias as a function of richness (dashed line) for our survey given our fiducial model. The richness function is defined as the number of clusters found having more than the specified minimum number of galaxies. The horizontal axis is the number of galaxies.



D. Survey Assumptions 

We assume a $10^4$ deg$^2$ sky survey, which is the target SDSS coverage, and a volume limit of $z < 0.1$. In the fiducial cosmology, this corresponds to a volume of about $6 \cdot 10^7$ Mpc$^3$. At our volume limit of $z = 0.1$, and for the SDSS telescope, one expects all galaxies with absolute magnitude $M \lesssim -18$ to be detected and have their redshift measured^, justifying our choice for $M_1$ in the fiducial model. Notice that while the volume $V$ has some error $\Delta V$ arising from the fact that redshifts have some intrinsic error, we have set this term to zero, assuming spectroscopic redshifts of the galaxies are available. This is reasonable given our very shallow survey, though the error $\Delta V$ will become non-negligible in higher redshift surveys.

Finally, we assume all clusters in the catalogue are binned logarithmically into 20 different bins. We have checked that this binning is fine enough to accurately contain all the information in our survey. The lowest and highest richness classes we consider are clusters with 10 and 120 galaxies respectively, which corresponds to a mass range between $M_{\min} \approx 5 \cdot 10^{13} M_\odot$ and $M_{\max} \approx 7 \cdot 10^{14} M_\odot$. The predicted cluster richness function is shown in figure 2.
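
The binning and the quoted mass range can be reproduced with a few lines of Python. This is a sketch under simplifying assumptions: the function names are hypothetical, and the mass assigned to a richness is obtained by inverting the mean relation $N = 1 + (m/M_1)^{\alpha}$, ignoring both the Poisson scatter in the HOD and the scatter introduced by the cluster finder.

```python
import math

def log_bin_edges(n_min=10.0, n_max=120.0, n_bins=20):
    """Edges of n_bins logarithmically spaced richness bins."""
    step = (math.log(n_max) - math.log(n_min)) / n_bins
    return [n_min * math.exp(i * step) for i in range(n_bins + 1)]

def mean_mass_for_richness(n_gal, m1=6.0e12, alpha=1.0):
    """Halo mass whose mean occupation (1 central + satellites) equals n_gal.

    Inverts N = 1 + (m / M1) ** alpha; scatter is ignored in this sketch.
    """
    return m1 * (n_gal - 1.0) ** (1.0 / alpha)

edges = log_bin_edges()
m_low = mean_mass_for_richness(10.0)     # about 5.4e13 solar masses
m_high = mean_mass_for_richness(120.0)   # about 7.1e14 solar masses
```

With the fiducial $M_1$ and $\alpha$, the two limiting richness classes map onto masses consistent with the range quoted in the text.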

One final note: we are considering bias to be our observable. To obtain the bias, one needs to fit the correlation function as a whole as $\xi = b^2\xi_{\rm lin}$ (note $\xi_{\rm lin}$ varies with cosmology). The fit gives the bias measurement, while the shape information is discarded. We expect then that the confidence regions we obtain here may be improved upon when the full information of the correlation function is used.



VI. RESULTS 



A. Determination of the Confidence Regions 

Now that we have the correlation matrix of our observables, we estimate the Fisher matrix for the HOD and cosmological parameters as (see e.g. [16])

where $\lambda_\alpha$ stands for the various parameters of interest and $O_i$ label our various observables. As explained in [16], the matrix $F$ serves as an approximation of the inverse correlation matrix for the parameters $\lambda_\alpha$, which we use to compute


See http://www.sdss.org/documents/goals.html



FIG. 3: The solid lines in this figure delimit the 95% confidence region marginalized over all parameters. Even moderately strong priors of $\Delta\alpha = 5\%$ and $\Delta\sigma_8 = 0.1$ do not improve the constraints much, though holding them fixed collapses the confidence region to that enclosed by the dashed lines. Finally, the dotted line corresponds to the contour $M_1/\Omega_m =$ constant, the constant being set by the fiducial model. The 95% confidence limits on $\Omega_m M_1^{-1}$ are centered on $\Omega_m M_1^{-1} = 5.0 \cdot 10^{-14} M_\odot^{-1}$, both when marginalizing over $\alpha$, $\sigma_8$ and when holding $\alpha$, $\sigma_8$ fixed. Axes: $M_1$ ($10^{12} M_\odot$) horizontal, $\Omega_m$ vertical.

the confidence regions reported here. Our observables in this case are the number of clusters found in each bin and their bias, and the parameters of interest are those specifying the cosmology and the HOD. The one caveat is that, to obtain the confidence regions, we will use as our parameters not the model parameters themselves (i.e. $\alpha$, $M_1$, $\sigma_8$, ...) but their logarithms. There are two motivations behind this:

• Using the natural logarithms enforces positivity in all parameters.

• Traditionally, cluster surveys have been used to obtain constraints of the form $\sigma_8\Omega_m^{\gamma} =$ constant. These types of constraints are obtained as eigenvectors of the Fisher matrix when the natural logarithms of the parameters of interest are used in computing the Fisher matrix, so using logarithms should make comparison to previous work more straightforward.

Unless stated otherwise, all of our confidence regions and intervals will be marginalized over $h$, $n$, and $\Omega_b h^2$ using gaussian priors

$\sigma_h = 0.1$, $\sigma_n = 0.1$, $\sigma_{\Omega_b h^2} = 0.002$.
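
The bookkeeping described in this subsection (switching to logarithmic parameters, adding gaussian priors, and marginalizing) can be sketched in pure Python as follows. This is an illustrative outline, not the pipeline used in the paper: the function names are hypothetical, the Fisher matrix is assumed to be given as a list of lists, and marginalized 1σ errors are read off the diagonal of the inverse Fisher matrix. A prior of width σ on a parameter $p$ corresponds approximately to a width σ/$p$ on $\ln p$.

```python
def to_log_params(fisher, fiducial):
    """Fisher matrix for ln(p_i), given the Fisher matrix for p_i.

    Uses d/d(ln p) = p d/dp, i.e. F_log[i][j] = p_i * p_j * F[i][j].
    """
    n = len(fiducial)
    return [[fisher[i][j] * fiducial[i] * fiducial[j] for j in range(n)]
            for i in range(n)]

def add_gaussian_priors(fisher, sigmas):
    """Add 1 / sigma**2 to the diagonal; use None for 'no prior'."""
    out = [row[:] for row in fisher]
    for i, sigma in enumerate(sigmas):
        if sigma is not None:
            out[i][i] += 1.0 / sigma ** 2
    return out

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-300:
            raise ValueError("matrix is singular; a prior may be needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def marginalized_sigmas(fisher):
    """1-sigma errors on each parameter, marginalized over all the others."""
    cov = invert(fisher)
    return [cov[i][i] ** 0.5 for i in range(len(cov))]
```

Fixing a parameter, as opposed to marginalizing over it, corresponds instead to deleting its row and column from the Fisher matrix before inverting.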
B. $M_1$ - $\Omega_m$ Degeneracy

The first thing we notice upon computation of the Fisher matrix is the existence of an extremely large degeneracy between $M_1$ and $\Omega_m$ of the form $\Omega_m/M_1 =$ constant. This is shown graphically in figure 3. Here, we plot the 95% confidence region in the $\Omega_m - M_1$ plane when marginalizing over $\alpha$ and $\sigma_8$ (solid lines) and when holding them constant (dashed lines). The dotted line corresponds to the equation $\Omega_m/M_1 =$ constant, the constant being set by the fiducial model. The 95% confidence constraint marginalized over all other parameters is $\Omega_m/M_1 \approx 5.0 \cdot 10^{-14} M_\odot^{-1}$ with an uncertainty of about $2 \cdot 10^{-14} M_\odot^{-1}$, a rather poor constraint. The perpendicular direction $\Omega_m M_1 =$ constant is not constrained at all.

The reason for this degeneracy can be traced to the behavior of the mass function with $\Omega_m$ and the fact that our observable (number of galaxies in a cluster) scales as $m/M_1$. To see this mathematically, first note that for the halo mass function of Eq. (38)

$$dm\,n(m,\Omega_m) \approx dx\,F(x) \qquad (41)$$

for $x = m/\Omega_m$ and $F$ being some function. Equation (41) can be verified by referring back to equation (38). We see there that $n(m,\Omega_m)$ takes the form
FIG. 4: The 95% confidence regions marginalized over the Hubble rate, spectral index, and baryon density using moderate gaussian priors. The outer dashed ellipse is also marginalized over $M_1$ and $\Omega_m$ without assuming any priors for these quantities. The inner solid ellipse is obtained assuming gaussian priors $\Omega_m = 0.3 \pm 0.1$ and $\Delta M_1/M_1 = 30\%$.
$$dm\,n = (\Omega_m\,dx)\,\frac{\rho_c}{x}\,\frac{d\nu}{dm}\,f(\nu) \qquad (42)$$

where $\nu = \delta_c^2/\sigma(R(m))^2$. Since $R(m)$ is given by the condition $4\pi R(m)^3\Omega_m\rho_c = 3m$, it is clear that $R(m)$ depends only on $x$. Thus, if the power spectrum were independent of $\Omega_m$, $\sigma(R(m))$, and hence $\nu$, would depend only on $x$. There is, however, a small dependence of $\sigma(R(m))$ on $\Omega_m$ alone, which comes through the power spectrum used in the convolution. This small dependence makes equation (41) only approximate.



Now, for a given mass $m$, the number of galaxies in the halo, and therefore the value of $\langle\psi_s|m\rangle$, depends via Eq. (37) only on $m/M_1$. That is, $\langle\psi_s|m\rangle = g(m/M_1)$ for some function $g$, so

$$\begin{aligned}
n_s &= \int dm\,n(m,\Omega_m)\,g(m/M_1) \\
    &\approx \int dx\,F(x)\,g(\Omega_m x/M_1) \\
    &= \phi(\Omega_m/M_1) \qquad (44)
\end{aligned}$$

where $\phi$ is a function. We see then that $n_s$ depends only on the ratio $M_1/\Omega_m$, so there is indeed a full degeneracy between the two parameters. Since the argument assumes only that the mass tracer scales as a function of $m/M_1$, where $M_1$ is a characteristic mass scale, we expect this degeneracy to be quite general. It is also easy to check that a similar computation holds for bias.
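
The scaling argument can be checked numerically with a toy model. The Python sketch below is purely illustrative and makes two assumptions that are not taken from the paper: the mass function is given the exactly self-similar form $n(m)\,dm = F(x)\,dx$ with a made-up $F(x) = x^{-3/2}e^{-x}$ (masses in arbitrary units), and the selection function is $g(m/M_1) = 1 - e^{-m/M_1}$, the probability that a halo hosts at least one satellite when $\alpha = 1$.

```python
import math

def toy_abundance(omega_m, m1, n_steps=4000,
                  ln_lo=math.log(1e-12), ln_hi=math.log(1e3)):
    """Trapezoid estimate of  integral dm n(m, Omega_m) g(m / M1).

    Toy assumptions: n(m) dm = F(x) dx with x = m / Omega_m and
    F(x) = x**-1.5 * exp(-x); g(y) = 1 - exp(-y).
    """
    h = (ln_hi - ln_lo) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        m = math.exp(ln_lo + i * h)
        x = m / omega_m
        n_m = x ** -1.5 * math.exp(-x) / omega_m   # n(m) = F(x) / Omega_m
        g = -math.expm1(-m / m1)                   # 1 - exp(-m / M1)
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * n_m * g * m * h          # dm = m dln(m)
    return total

base = toy_abundance(0.3, 1.0)
same_ratio = toy_abundance(0.6, 2.0)   # Omega_m / M1 unchanged
new_ratio = toy_abundance(0.6, 1.0)    # Omega_m / M1 doubled
```

In this toy model, doubling both $\Omega_m$ and $M_1$ leaves the abundance unchanged, while changing only one of them does not, which is the degeneracy derived above. In the real calculation the invariance is only approximate because of the residual $\Omega_m$ dependence of the power spectrum.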



C. Constraints on $\alpha$ and $\sigma_8$



Despite the strong degeneracy between $M_1$ and $\Omega_m$, it is still possible to constrain $\alpha$ and $\sigma_8$ without the use of any further priors. Shown in figure 4 with the dashed line is the 95% confidence region of the $\alpha - \sigma_8$ plane marginalized over $\Omega_m$ and $M_1$ assuming no priors for either of these two quantities. The constraints improve only slightly when invoking priors on either $\Omega_m$ or $M_1$. Finally, the solid contour encloses the 95% confidence region with reasonable priors $\Delta\Omega_m = 0.1$ and $\Delta M_1/M_1 = 30\%$.
FIG. 5: The 95% confidence regions above are marginalized over the Hubble rate, spectral index, and baryon density using moderate gaussian priors. In addition, we have assumed a gaussian prior for the $M_1$ parameter of width $\Delta M_1 = 50\%$ (dotted line), $\Delta M_1 = 30\%$ (solid line), and $\Delta M_1 = 15\%$ (dashed line). The inner curve (dash-dot) is obtained by fixing $h$ and using the 30% prior on $M_1$.

D. The $\sigma_8$ - $\Omega_m$ Plane

We have seen it is impossible to obtain information about $\Omega_m$ and $M_1$ simultaneously due to a degeneracy of the form $\Omega_m/M_1 =$ constant. For our fiducial model, we saw in particular that $M_1$ and $\Omega_m$ satisfy $\Omega_m/M_1 \approx 5.0 \cdot 10^{-14} M_\odot^{-1}$, with an uncertainty of about $2 \cdot 10^{-14} M_\odot^{-1}$ (95% confidence). The fact that $\Omega_m/M_1$ is so poorly constrained is reflected in the constraints one may obtain on $\Omega_m$ when assuming priors on $M_1$ and vice-versa. It seems then that local cluster abundances are not well suited to constrain either $\Omega_m$ or $M_1$.

To date, most of the work on cluster abundances has been aimed at providing confidence regions in the $\sigma_8 - \Omega_m$ plane (see e.g. [53] and references therein). It is helpful then to study the type of constraints we can place in the $\Omega_m - \sigma_8$ plane, not only to touch base with other work in this area, but also to see whether we can indeed expect to constrain cosmology with local cluster surveys.

We show our constraints in figure 5, where we plot the 95% confidence regions in the $\Omega_m - \sigma_8$ plane for various gaussian priors on $M_1$ (marginalization over all other parameters is also done just as before). The dotted line is for a gaussian prior with $\Delta M_1 = 50\%$, the solid line is for $\Delta M_1 = 30\%$, and the dashed line for $\Delta M_1 = 15\%$. The most strongly constrained combination of $\sigma_8$ and $\Omega_m$ using a 30% prior on $M_1$ is $\sigma_8$ times a power of $\Omega_m$. Notice the exponent of this constraint is rather different from the one typically found in the literature from local cluster abundances, which looks more like $\sigma_8\Omega_m^{\gamma} \approx$ constant with $\gamma \approx 0.5$. This is not surprising, however, given that there are major differences between our analysis and most previous treatments, including the use of clustering properties (bias), the probing of lower masses in the halo mass function, and a marginalization over $h$, which is often fixed in cluster abundance analyses. In light of this last point, we also plot in figure 5 the expected confidence region obtained when $h$ is fixed and a 30% gaussian prior on $M_1$ is used (dash-dot curve). It should be evident that marginalization over $h$ is extremely important, and that constraints derived by holding $h$ fixed are over-optimistic.

The main point that should be very clear from figure 5 is that, because of the very strong $\Omega_m - M_1$ degeneracy, the constraints that we can place on $\Omega_m$ are entirely determined by how well the amplitude of the mass-richness relation can be calibrated. Further, we expect this to be a generic feature of all cluster abundance studies (regardless of the mass tracer used) since we believe the degeneracy stems from how the halo mass function scales with $\Omega_m$.

E. Constraining Cosmology With Cluster Statistics 

We have derived above the various constraints on the $\sigma_8 - \Omega_m$ plane that we expect to obtain from our model survey. Importantly, we saw that marginalization over $h$ is necessary to avoid overly optimistic constraints. In light of this sensitivity to $h$, it seems worthwhile to determine what combination of $\sigma_8$, $\Omega_m$, and $h$ is most strongly constrained by the data (assuming a prior calibration of $\alpha$ and $M_1$). We compute these combinations of parameters using a principal component analysis of the estimated correlation matrix. We find then that, for gaussian priors $\Delta M_1 = 30\%$ and
FIG. 6: The 95% confidence regions in the $\alpha - M_1$ plane. Moderate gaussian priors $\Delta\Omega_m = \Delta n = \Delta h = 0.1$ and $\Delta\sigma_8 = 0.20$ were used for the solid line, while the dashed line uses $\Delta\Omega_m = \Delta n = \Delta h = 0.05$ and $\Delta\sigma_8 = 0.15$. The inner, dotted ellipse fixes cosmology. The 95% confidence region for $\alpha$ when moderate priors are used is centered on $\alpha = 1.000$. If one keeps cosmology fixed, $M_1$ can also be constrained, with a 95% confidence region centered on $M_1 = 6.00 \cdot 10^{12} M_\odot$.
Not unexpectedly, the best constrained mode is mostly $\sigma_8$, but with an important contribution from $h$ (if this were not the case, marginalization over $h$ would have had little effect before). We can see as well that $\Omega_m$ is the most poorly determined parameter, which makes sense given the large assumed uncertainty in $M_1$ and the strong $M_1 - \Omega_m$ degeneracy. The constraints on $h$ that one can obtain with this method are at a moderately interesting level. If one introduces a prior $\Delta\Omega_m = 0.1$, the 68% confidence interval obtained from cluster statistics is $h = 0.7 \pm 0.09$, comparable to $h = 0.72 \pm 0.02 \pm 0.08$ from supernovae (Freedman et al. [19]) or $h = 0.72 \pm 0.05$ from the CMB (assuming a ΛCDM cosmology, see e.g. Spergel et al.).

In view of the above comments, it is clear that aside from consistency checks, the most important contribution from local cluster statistics for the purposes of determining cosmological parameters is the accurate determination of $\sigma_8$. In particular, using reasonable priors for all other variables we expect it will be possible to constrain $\sigma_8$ at the $\approx 10$-$15\%$ level at 95% confidence.

F. Constraining the Halo Occupation Distribution 

Let us consider now the complementary problem of constraining the halo occupation distribution by either fixing cosmology or marginalizing over it. These constraints should be of great interest in that they may help guide theoretical efforts in galaxy formation models. Our results are shown in figure 6, where we plot the 95% confidence regions in the $\alpha - M_1$ plane when we hold cosmology fixed (dotted line), when marginalizing with strong gaussian priors $\Delta\Omega_m = \Delta n = \Delta h = 0.05$ and $\Delta\sigma_8 = 0.15$ (dashed line), and when marginalizing with moderate priors $\Delta\Omega_m = \Delta n = \Delta h = 0.1$ and $\Delta\sigma_8 = 0.20$ (solid line). As we expected, $M_1$ is poorly constrained and its confidence interval depends on the prior used for $\Omega_m$. The slope $\alpha$, on the other hand, can be well constrained. Our moderate priors lead to a 95% confidence


By reasonable priors we mean $\Delta\alpha/\alpha = 10\%$, $\Delta M_1/M_1 = 30\%$, $\Delta\Omega_m = 0.1$, $\Delta h = 0.1$, and $\Delta n = 0.1$.
Assumptions                                          $\Delta\alpha$    $\Delta\sigma_8$    $\Delta h$
$\Delta\Omega_m, \Delta n = 0.1$                     ±0.038            ±0.079              ±0.089
$\Delta\Omega_m, \Delta n = 0.05$                    ±0.032            ±0.048              ±0.070
$\Delta\Omega_m, \Delta n, \Delta h = 0.1$           ±0.036            ±0.080
$\Delta\Omega_m, \Delta n, \Delta h = 0.05$          ±0.028            ±0.037

TABLE I: 1σ predictions for $\alpha$, $\sigma_8$, and $h$ from cluster samples obtained with SDSS type surveys. All error bars assume gaussian priors $\Delta\Omega_b h^2 = 0.002$ and $\Delta M_1 = 30\%$ as well as the assumptions listed in the table.



interval centered on $\alpha = 1.000$. If one insists on keeping cosmology fixed, however, both $\alpha$ and $M_1$ may be constrained to good accuracy. In particular, for fixed cosmology the 95% confidence regions are centered on $\alpha = 1.000$ and $M_1 = 6.00 \cdot 10^{12} M_\odot$.



G. What Do Local Cluster Abundances and Bias Tell us? 



Given our above results, it seems fair to ponder what the best use of local cluster samples is. We have seen that bias allows us to constrain $\alpha$ and $\sigma_8$ simultaneously, though large degeneracies with $h$ exist. We have also seen that cluster abundances and bias can only constrain the combination $M_1/\Omega_m$, and only poorly at that. It seems then that rather than attempting to constrain cosmology alone or the mass tracer relations alone, the data are best used to constrain the three parameters $\alpha$, $\sigma_8$, and $h$.

We quote in table I how well we can constrain the various parameters ($\alpha$, $\sigma_8$, and $h$) under various assumptions. This is meant to illustrate the power of cluster samples obtained from SDSS type surveys. We assume in all cases gaussian priors $\Delta\Omega_b h^2 = 0.002$ and $\Delta M_1 = 30\%$. The values listed under the assumptions column are the values used as gaussian priors for the appropriate variable. All confidence intervals are 68% and marginalized over all other parameters.

Is this the best we can do? Yes and no. On the one hand, we can perform a singular value decomposition of the Fisher matrix assuming no priors on any of the parameters, thus determining which and how many directions are strongly constrained given our assumptions. We find that there are indeed three directions which may be strongly constrained, and these are most closely aligned with the $\alpha$, $\sigma_8$ and $h$ subspace. Thus, we cannot hope to place more than three strong constraints, and if we want to choose three of our parameters, our best choice is the triplet ($\alpha$, $\sigma_8$, $h$) considered above. Nevertheless, the three directions which are most strongly constrained all have some contribution from the parameters $M_1/\Omega_m$ and $n$. In principle, then, we could do "better" if we opt for constraining combinations of the five parameters ($\alpha$, $\sigma_8$, $h$, $M_1/\Omega_m$, $n$), though only three such combinations may be constrained.
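
The decomposition described above can be illustrated with a small pure-Python eigen-solver. This is a sketch, not the authors' procedure: it uses classical Jacobi rotations on a symmetric matrix (for which eigen-decomposition and singular value decomposition coincide when the matrix is positive semi-definite), the function names are hypothetical, and the 2×2 Fisher matrix in ($\ln\sigma_8$, $\ln\Omega_m$) used as input is made up for illustration.

```python
import math

def jacobi_eigh(matrix, tol=1e-12, max_rotations=10000):
    """Eigenvalues and eigenvectors (columns of v) of a symmetric matrix."""
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_rotations):
        # pick the largest off-diagonal element (classical Jacobi pivoting)
        p, q, big = 0, 1, 0.0
        for i in range(n - 1):
            for j in range(i + 1, n):
                if abs(a[i][j]) > big:
                    p, q, big = i, j, abs(a[i][j])
        if big < tol:
            break
        theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
        c, s = math.cos(theta), math.sin(theta)
        for k in range(n):              # A <- A J (columns p, q)
            akp, akq = a[k][p], a[k][q]
            a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
        for k in range(n):              # A <- J^T A (rows p, q)
            apk, aqk = a[p][k], a[q][k]
            a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
        for k in range(n):              # V <- V J accumulates eigenvectors
            vkp, vkq = v[k][p], v[k][q]
            v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

def best_constrained_direction(fisher):
    """Largest eigenvalue of the Fisher matrix and its eigenvector."""
    values, vectors = jacobi_eigh(fisher)
    k = max(range(len(values)), key=lambda i: values[i])
    return values[k], [row[k] for row in vectors]

# Made-up Fisher matrix in (ln sigma_8, ln Omega_m), for illustration only.
toy_fisher = [[100.0, 50.0], [50.0, 26.0]]
eigenvalue, direction = best_constrained_direction(toy_fisher)
exponent = direction[1] / direction[0]   # constraint: sigma_8 * Omega_m ** exponent
```

Because the parameters are logarithms, an eigenvector $(v_1, v_2)$ with a large eigenvalue corresponds to a tight constraint on $\sigma_8^{v_1}\Omega_m^{v_2}$; for the toy matrix above the best constrained combination is close to $\sigma_8\Omega_m^{0.5}$. Counting the eigenvalues that are large compared with the rest gives the number of strongly constrained directions.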



VII. CLUSTER ABUNDANCES 



A. The Role of Bias as a Complement to Cluster Abundances 



We wish to consider now the type of constraints we can place if we have cluster abundance information alone. Before we do this, it is important to address whether the inclusion of clustering properties (in the form of cluster bias) contains information not included in the cluster richness function. If not, repeating our analysis disregarding bias information would not alter any of our results. We demonstrate that bias does indeed carry some additional information by considering how well $\alpha$ can be determined from our data.

Figure 7 shows the 95% confidence regions in the $\alpha - \sigma_8$ plane, marginalized over all other parameters, and where in addition to the usual priors on $h$, $n$, and $\Omega_b$, we used priors $\Delta M_1 = 30\%$ and $\Delta\Omega_m = 0.1$. The solid lines are obtained by using cluster abundances but ignoring bias, while the dotted lines are obtained by ignoring cluster abundances but including bias. Also shown as a dashed ellipse is the curve we obtain when both bias and abundances are used. Note this last ellipse is the same as the dashed ellipse in figure 4. The important result then is that cluster abundances are degenerate in $\alpha - \sigma_8$ (the product of $\alpha$ and a power of $\sigma_8$ is approximately constant) while bias breaks this degeneracy. This demonstrates explicitly that information contained in the bias complements that of cluster abundances by themselves.

An important consequence of this argument is that, when bias information is not included, one needs to either fix $\alpha$ or assume some prior on it. We checked this explicitly by noting that the $\sigma_8 - \Omega_m$ confidence regions obtained from cluster abundances alone blow up if no prior is placed on $\alpha$.
FIG. 7: The 95% confidence regions in the $\alpha - \sigma_8$ plane, marginalized over $h$, $n$, $\Omega_b$, $M_1$, and $\Omega_m$ with the usual priors on $h$, $n$, and $\Omega_b$, and gaussian priors $\Delta M_1 = 30\%$ and $\Delta\Omega_m = 0.1$ for the last two parameters. The solid line is obtained when using cluster abundances but no bias information, while the dotted line corresponds to the converse case. We see that bias breaks an $\alpha - \sigma_8$ degeneracy which exists when using only cluster abundance information. Shown for comparison, the dashed ellipse above is the same as the dashed ellipse in figure 4, i.e. the confidence region obtained with both cluster abundance and bias information with the aforementioned priors on $M_1$ and $\Omega_m$.

B. Can Local Cluster Abundance Alone Constrain erg and fim? 

In view of the degeneracy between a and erg in cluster abundance considerations, we wish to consider to what extent 
the (Tg — rim plane can be constrained using cluster abundances alone. We have already seen that due to the Mi — fi^ 
degeneracy, the constraint on fl„i is determined in its entirety by how well can the amplitude of the richness-mass 
relation be calibrated. Likewise, we expect that the constraints on erg derived from cluster abundance studies to be 
entirely determined by the calibration constraints on a. This is shown explicitly in figure |S1 where we plot the 95% 
confidence regions in the erg — flm plane for three different gaussian priors on a: Aa = 10% (dotted line), Aa = 5% 
(dash-dot), and a — constant (dashed line). A gaussian prior AMi — 30% is used, and we marginalized over Q,bh? , n, 
and h with the usual priors. When computing these confidence regions, we disregarded all bias information. Also 
shown for reference in the solid line is the contour obtained when both bias and cluster abundances are used and 
assuming a fixed a. We note that when including bias information, the confidence cg — 17„ region is not considerably 
worsened by letting a float (not shown). 

It should be clear from figure |S1 that, as we expected, the erg confidence interval is entirely determined by the prior 
on a, just as the flm interval is determined by the Mi prior. Further, we note that the constraints on erg are rather 
weak for realistic priors on a. In particular, even when all parameters except a and CTg are fixed, the 95% confidence 
interval assuming gaussian priors Aa — 10% and Aa = 5% are erg — 0.85^0^26 ^8 = O.SS^q '^g respectively. 

There are thus two main results from this exercise. Firstly, cosmology can be constrained by optical cluster
abundances only to the extent that a careful calibration of the richness-mass relation can be achieved. Secondly,
we find that, using realistic assumptions for the accuracy of the calibration parameters, neither $\sigma_8$ nor $\Omega_m$ may be
accurately constrained by cluster abundances alone. As a corollary, any constraints placed on $\Omega_m$
and $\sigma_8$ using cluster abundances which are not properly marginalized over $M_1$ and $\alpha$ will be overly
optimistic (see figure 11).
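The way a prior on a calibration parameter can set the error on a cosmological parameter may be illustrated with a toy Fisher matrix. The sketch below is not the calculation performed in this paper: it assumes a hypothetical two-parameter problem in (ln α, ln σ8) in which the data constrain only one linear combination, and the exponent `p` and the data error are arbitrary illustrative values. A gaussian prior of width s is imposed by adding 1/s² to the corresponding diagonal element, and marginalization is done by inverting the full matrix.

```python
import numpy as np

def marginalized_sigma(fisher, priors, index):
    """1-sigma error on parameter `index`, marginalized over the others.

    `priors` maps a parameter index to the width of its gaussian prior.
    """
    f = np.array(fisher, dtype=float)  # copy, so the input is not modified
    for i, width in priors.items():
        f[i, i] += 1.0 / width**2      # a gaussian prior adds 1/width^2
    cov = np.linalg.inv(f)             # marginalization = invert the full matrix
    return float(np.sqrt(cov[index, index]))

# Hypothetical toy model, parameters (ln alpha, ln sigma8): the data fix only
# ln(alpha) + p * ln(sigma8), to 2%. Both numbers are illustrative assumptions.
p = 0.4
data_error = 0.02
direction = np.array([1.0, p])
fisher = np.outer(direction, direction) / data_error**2

for alpha_prior in (0.10, 0.05):
    sigma8_error = marginalized_sigma(fisher, {0: alpha_prior}, 1)
    print(f"prior on ln(alpha) = {alpha_prior:.2f} -> error on ln(sigma8) = {sigma8_error:.3f}")
```

In this toy model the error on ln σ8 works out to sqrt(s² + data_error²)/p, so once the prior width s exceeds the data error the result simply tracks the prior.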

C. What do Local Cluster Abundances Alone Tell Us? 

As in section VI C, we can now ask how best to use cluster abundance information. We attack this
question by finding the eigenvalues and eigenvectors of the estimated parameter correlation matrix. Upon doing so,
we see that there are only two directions which are strongly constrained. These two eigenvectors are most closely
aligned with the plane spanned by $h$ and the product of $\alpha$ with a power of $\sigma_8$, but have contributions from $\Omega_m/M_1$ and $n$, as well as a very weak contribution







FIG. 8: The 95% confidence regions above are obtained using gaussian priors on $\Omega_b h^2$, $n$, and $h$, and disregarding all bias information, i.e.
these are the expected confidence regions from optical cluster abundance studies. Note in particular the scale on the $\sigma_8$ axis. The various
curves correspond to different priors on the richness-mass relation parameters. These are $\Delta\alpha/\alpha = 10\%$ (dotted), $\Delta\alpha/\alpha = 5\%$ (dash-dot),
and $\Delta\alpha = 0$ (dashed). All contours are obtained using a 30% gaussian prior on $M_1$. For comparison, we also show the contour obtained
when we add bias information and keep $\alpha$ fixed, shown here as the solid line. Compare also to figure |^ (though note the change in scale
for the $\sigma_8$ axis).



from $\sigma_8$. For reference, we write the eigenvectors below



where $n = 1 + \delta n$. Though these may not seem terribly illuminating, these expressions contain much information.
Consider the first eigenvector: if we take all parameters except $\sigma_8$ and $\Omega_m$ to be fixed, this eigenvector reduces to
$\sigma_8\Omega_m^{0.73}$ = constant, a cluster normalization condition. Thus, equation (48) may be thought of as a generalized cluster
abundance normalization condition (see Appendix A for more discussion).

The second eigenvector above, equation (49), does not have a simple interpretation (though see Appendix A). Regardless,
there are still elements which are of interest. Importantly, $\Omega_m$ and $M_1$ appear only through the ratio $\Omega_m/M_1$, and $\alpha$ only through its product with a power of $\sigma_8$,
which are the degeneracies we have already found. This confirms that these degeneracies are indeed intrinsic to cluster
abundance studies and cannot be avoided.

As a final note, and in answer to the question posed by the title of this section, we state here that cluster abundances
are best suited to constrain $h$ and the combination of $\alpha$ with a power of $\sigma_8$. The constraints are somewhat sensitive to
$n$ and $\Omega_m/M_1$, but only moderately so. We show in table II the 68% confidence intervals for this combination under various
assumptions.
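The eigenvector analysis used in this section can be sketched as follows. This is a minimal illustration, assuming a Fisher matrix expressed in the logarithms of the parameters, so that each eigenvector corresponds to a product of powers. The function name and the toy matrix are hypothetical: the matrix is built by hand to have $\sigma_8\Omega_m^{0.73}$ as its best constrained direction, and the 3% and 50% widths are arbitrary.

```python
import numpy as np

def constrained_directions(fisher, names):
    """Eigen-directions of a Fisher matrix in log-parameters, best constrained first.

    Returns a list of (sigma, expression): `expression` is the product of powers
    the direction corresponds to, `sigma` the 1-sigma error on its logarithm.
    Assumes a positive definite matrix.
    """
    evals, evecs = np.linalg.eigh(np.asarray(fisher, dtype=float))
    results = []
    for k in np.argsort(evals)[::-1]:       # largest eigenvalue = tightest constraint
        v = evecs[:, k]
        lead = v[np.argmax(np.abs(v))]      # rescale so the largest exponent is 1
        exponents = v / lead
        sigma = 1.0 / (abs(lead) * np.sqrt(evals[k]))
        expression = " * ".join(f"{name}^{c:.2f}" for name, c in zip(names, exponents))
        results.append((float(sigma), expression))
    return results

# Hypothetical Fisher matrix for (ln sigma8, ln Omega_m): a 3% constraint along
# sigma8 * Omega_m^0.73 and a 50% constraint along the orthogonal direction.
tight = np.array([1.0, 0.73])
loose = np.array([-0.73, 1.0])
fisher = np.outer(tight, tight) / 0.03**2 + np.outer(loose, loose) / 0.50**2

for sigma, expression in constrained_directions(fisher, ["sigma8", "Omega_m"]):
    print(f"{expression} = constant, to {100 * sigma:.0f}%")
```

Rescaling each eigenvector so that its largest exponent equals one is only a presentational choice; the quoted error is rescaled accordingly.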



In equations (48) and (49) we have disregarded the very weak dependence on $\sigma_8$ alone. This dependence shows up as an extra factor of a power of $\sigma_8$
in both eigenvectors.

Assumptions                                                        1σ error
(none)                                                             0.090
$\Delta\alpha = 10\%$                                              0.074
$\Delta\alpha = 10\%$, $\Delta\sigma_8 = 0.2$                      0.064
$\Delta\alpha = 10\%$, $\Delta\sigma_8 = 0.2$, $\Delta h = 0.1$    0.050
*                                                                  0.053
* + $\Delta h = 0.05$                                              0.035

TABLE II: 1σ predictions for the combination of $\alpha$ with a power of $\sigma_8$ discussed in the text, from cluster samples obtained with SDSS type surveys. All error bars assume gaussian priors
$\Delta\Omega_b h^2 = 0.002$, $\Delta M_1 = 30\%$, $\Delta n = 0.1$, and $\Delta\Omega_m = 0.1$, as well as the assumptions listed in the table. The next to last row, marked
with a * in assumptions, assumes $\Delta\Omega_m = 0.05$ and $\Delta n = 0.05$, as well as priors $\Delta\alpha = 10\%$ and $\Delta\sigma_8 = 0.2$. In all cases where no prior on
$h$ is assumed, the 1σ interval for $h$ is $h = 0.7 \pm 0.1$, which reduces to $h = 0.70 \pm 0.08$ for case *.

D. Origin of the Degeneracies

Perhaps the most surprising result so far is that local cluster abundances alone are not capable of constraining
either $\Omega_m$ or $\sigma_8$ very precisely when using reasonable priors for the richness-mass relation. This result, however, was
obtained for optical cluster surveys, which amounts operationally to a choice of mass scale $M_1$ and power law index
$\alpha$ in the scaling of the mass tracer with halo mass. It is possible, then, that our conclusions do not hold in the case of
X-ray surveys or other mass tracers, where one would have different values for $M_1$ and $\alpha$. Here, we wish to investigate
where the degeneracies stem from, to determine whether we expect them to be generic or particular to the fiducial
model we have assumed.



1. $M_1 - \Omega_m$ Degeneracy

Let us begin by analyzing the $M_1 - \Omega_m$ degeneracy. We concluded in section VI B that as long as the mass
tracer scales with mass as some function of $m/M_1$ for some characteristic mass scale $M_1$, then the degeneracy between
$M_1$ and $\Omega_m$ will always exist. The only possible way out of this statement is that our starting assumption, namely
equation (41), is violated. We therefore discuss what conditions equation (41) imposes on the mass function.

Let us assume that equation (41) holds and consider the product $dx\,f(x)$ for two values $(m, \Omega_m)$ and $(m', \Omega_m')$.
Equation (41) implies

$$dm\, n(m, \Omega_m) = dm'\, n(m', \Omega_m') \tag{50}$$

provided $m'/\Omega_m' = m/\Omega_m$. Defining $\lambda = \Omega_m'/\Omega_m$, we obtain $m' = \lambda m$, which upon substitution into the right hand side
above yields

$$dm\, n(m, \Omega_m) = dm\, \lambda\, n(\lambda m, \lambda\Omega_m). \tag{51}$$

In other words, equation (41) holds if and only if the halo mass function satisfies the scaling relation

$$n(m, \Omega_m) = \lambda\, n(\lambda m, \lambda\Omega_m). \tag{52}$$

Letting $\Omega_m = 1$ and relabeling $\lambda$ as $\Omega_m$, this relation simplifies to

$$n(m, \Omega_m = 1) = \Omega_m\, n(\Omega_m m, \Omega_m), \tag{53}$$

which was indeed found by Zheng et al. [57] using extensive numerical simulations. In fact, they found the correlation
function scales in a similar way, supporting the fact that the $M_1 - \Omega_m$ degeneracy we found is not broken when bias
is included as an additional observable.
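A scaling relation of the form (52) can be checked numerically on a toy mass function. The sketch below is only an illustration under simplifying assumptions that are not taken from this paper: it uses a Press-Schechter multiplicity function rather than Sheth-Tormen, and it assumes that the rms fluctuation depends on mass only through the Lagrangian radius, with a pure power law whose slope (`SLOPE`) is a hypothetical value. Under these assumptions the relation holds exactly, whereas for a realistic power spectrum it is only approximate.

```python
import numpy as np

RHO_CRIT = 2.775e11   # critical density in h^2 Msun / Mpc^3
DELTA_C = 1.686       # linear collapse threshold
SIGMA8 = 0.85         # fiducial normalization
SLOPE = 0.8           # hypothetical logarithmic slope of sigma(R)

def mass_function(m, omega_m):
    """Toy Press-Schechter dn/dm with sigma a pure power law in radius."""
    rho_mean = omega_m * RHO_CRIT
    radius = (3.0 * m / (4.0 * np.pi * rho_mean)) ** (1.0 / 3.0)   # Mpc/h
    sigma = SIGMA8 * (radius / 8.0) ** (-SLOPE)
    nu = DELTA_C / sigma
    multiplicity = np.sqrt(2.0 / np.pi) * nu * np.exp(-0.5 * nu**2)
    dlnsigma_dlnm = SLOPE / 3.0                                    # |dln sigma / dln m|
    return rho_mean / m**2 * multiplicity * dlnsigma_dlnm

masses = np.logspace(13, 15, 5)
omega_m, lam = 0.3, 1.7
lhs = mass_function(masses, omega_m)                     # n(m, Omega_m)
rhs = lam * mass_function(lam * masses, lam * omega_m)   # lambda n(lambda m, lambda Omega_m)
print(np.max(np.abs(rhs / lhs - 1.0)))
```

The relation holds here because rescaling $m$ and $\Omega_m$ by the same factor leaves the Lagrangian radius, and hence $\sigma$, unchanged, while the prefactor $\bar\rho/m^2$ picks up a factor $1/\lambda$.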

We are forced to conclude, then, that there is an intrinsic limit to how well we can constrain $\Omega_m$ from cluster
abundances, which is set by the uncertainty in the characteristic mass scale $M_1$. Mass measurements at present are
only accurate to about 20% (optimistically) to 50% (pessimistically), which essentially sets the maximum accuracy
one could achieve in $\Omega_m$ using local cluster surveys. Note, however, that we found that the ratio $M_1/\Omega_m$ itself was
rather poorly constrained, so it seems unlikely that local cluster abundances can provide strong constraints on $\Omega_m$.



The converse is proved simply by reversing our argument. 

Note we are not making any such claim for cluster abundance studies extending over a large redshift range, since then it is possible to 
provide constraints using the observed growth of structure. 
FIG. 9: Shown above as the solid line is the Sheth-Tormen halo mass function for our fiducial model ($\sigma_8 = 0.85$). Also shown are the
scaled mass functions (i.e. the right hand side of equation (57)) for $\sigma_8' = 1.0$ (dashed line) and $\sigma_8' = 0.7$ (dotted line). For comparison, we also
show the unscaled halo mass functions for $\sigma_8 = 1.0$ and $\sigma_8 = 0.7$ with the long-dash and dash-triple-dot lines respectively. The scaled
mass functions are seen to agree with each other to within 10% or better in the mass range 10^2 - 10^5 M0. The difference between the 
scaled mass function and our fiducial model mass function, divided by the halo mass function, is shown in the upper panel. Conventions 
are the same: $\sigma_8' = 1.0$ corresponds to the dashed line while $\sigma_8' = 0.7$ is shown with the dotted line.



2. $\alpha - \sigma_8$ Degeneracy



Let us now turn to the $\alpha - \sigma_8$ degeneracy. We perform an analysis similar to the one above to determine if the
degeneracy stems from a scaling property of the halo mass function.

Consider then a variation $(\alpha, \sigma_8) \rightarrow (\alpha', \sigma_8')$ which leaves cluster abundances fixed. The relation between the
parameters is easily obtained from a singular value decomposition of the Fisher matrix holding all parameters fixed
except for $\alpha$ and $\sigma_8$. We find that cluster abundances are approximately degenerate when the product of $\alpha$ with a fixed power of $\sigma_8$ is unchanged.

Now, since our mass tracer scales with mass as a power law, the binning functions may be expressed in terms of a
function $g(\alpha\ln(m/M_1))$. We have then

$$N_s(\alpha', \sigma_8') \propto \int dm'\, n(m', \sigma_8')\, g(\alpha'\ln(m'/M_1)). \tag{54}$$

We now perform a change of variables by defining $m$ via $\alpha\ln(m/M_1) = \alpha'\ln(m'/M_1)$. Defining $\lambda = \alpha/\alpha'$, so that
$m' = M_1 (m/M_1)^{\lambda}$, and substituting above we get

$$N_s(\alpha', \sigma_8') \propto \int dm\, \lambda\, (m/M_1)^{\lambda-1}\, n(m', \sigma_8')\, g(\alpha\ln(m/M_1)), \tag{55}$$

which is to be compared with

$$N_s(\alpha, \sigma_8) \propto \int dm\, n(m, \sigma_8)\, g(\alpha\ln(m/M_1)). \tag{56}$$
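The change of variables that takes equation (54) into equation (55) can be verified numerically. The sketch below is only a consistency check of the Jacobian: the mass function, the binning function $g$, the value of $M_1$, and the two power law indices are all hypothetical placeholders chosen for illustration, not the quantities used in this paper.

```python
import numpy as np

M1 = 1.0e12   # hypothetical characteristic mass scale

def toy_mass_function(m):
    """Placeholder mass function: power law with an exponential cutoff."""
    return m**-1.9 * np.exp(-m / 1.0e14)

def binning(x):
    """Placeholder binning function g, a gaussian bump in x = alpha * ln(m / M1)."""
    return np.exp(-0.5 * ((x - 3.0) / 0.3) ** 2)

def trapezoid(y, x):
    """Trapezoidal rule on a possibly non-uniform grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

alpha, alpha_prime = 1.0, 0.9
lam = alpha / alpha_prime
grid = np.logspace(12, 16, 20001)

# Equation (54): integrate over m' directly.
direct = trapezoid(toy_mass_function(grid) * binning(alpha_prime * np.log(grid / M1)), grid)

# Equation (55): integrate over m, with m' = M1 (m / M1)^lam and the Jacobian
# dm'/dm = lam (m / M1)^(lam - 1).
m_prime = M1 * (grid / M1) ** lam
jacobian = lam * (grid / M1) ** (lam - 1.0)
transformed = trapezoid(
    jacobian * toy_mass_function(m_prime) * binning(alpha * np.log(grid / M1)), grid
)

print(direct, transformed)
```

The two integrals agree up to discretization error for any smooth choice of the placeholder functions, since the step from (54) to (55) is an exact identity; the physical content of the argument lies in the comparison with (56).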



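If we demand that $\sigma_8'$ equal $\sigma_8$ multiplied by the power of $\lambda$ that keeps the degenerate combination of $\alpha$ and $\sigma_8$ fixed, then by construction $N_s(\alpha, \sigma_8) \approx N_s(\alpha', \sigma_8')$.
Since the function $g$ is the same in both equations (55) and (56), this suggests that the kernels of the integrals are nearly
degenerate. We thus make the ansatz

$$n(m, \sigma_8) \approx \lambda\, (m/M_1)^{\lambda-1}\, n\left(M_1 (m/M_1)^{\lambda}, \sigma_8'\right). \tag{57}$$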

We check our ansatz using the Sheth-Tormen halo mass function. In particular, figure 9 shows the Sheth-Tormen
halo mass function, computed for our fiducial model, as well as the scaled mass functions computed according to
equation (57). The smaller plot above shows the fractional difference between the right and left hand sides of equation
(57). The right hand side of equation (57) was computed for $\sigma_8' = 1.0$ (dashed line) and $\sigma_8' = 0.7$ (dotted line). We
find that our scaling relations are accurate to within « 10% for erg in the range [0.7, 1.0] and over the mass range 
10^"^ — IO^^Mq. Even though the accuracy of the scaling relations decreases as we move away from this range, it 
is still better than 20% from IO^^Mq up to « 2 • IO^^A/q for as in the range 0.7 < as < 1.0. 

We conclude therefore that, in the mass ranges probed by our model, there appears to exist a scaling relation given
by equation (57). Further, running the argument we used to derive our ansatz in reverse proves that the degeneracy
we observe does indeed stem from the scaling relation. We thus expect our results to hold for all mass tracers that
scale with mass via a power law and which probe the mass range 10^'^ — IO^^Mq. In particular, we expect a degeneracy 
between $\alpha$ and $\sigma_8$ to exist in all cases (though recall that bias breaks this degeneracy), as well as an $M_1 - \Omega_m$ degeneracy.
This further implies that any determination of cosmological parameters done using local cluster abundances alone
needs to be marginalized over uncertainties in how the mass tracer scales with mass, regardless of the choice of mass
tracer. Note that marginalization over the Hubble rate is also necessary, due to large mixing between the cosmological
parameters for fixed cluster densities.

To close, we note that the scaling relation (57) depends on $M_1$. This suggests that a more accurate degeneracy
can be achieved by allowing $M_1/\Omega_m$ to vary, using for example equations (48) and (49). Nevertheless, the relation we
obtained works well, and is only weakly dependent on the value of $M_1$. Indeed, we can change $M_1$ by up to a factor
of two up or down and still get reasonable agreement in the scaling relation (57). It is likely therefore that the role
of $M_1$ is essentially to set the mass scale over which the halo mass function will satisfy scalings of the form (57).

VIII. CONCLUSIONS 

We have derived expressions for cluster abundances and bias using the halo model formalism. In particular, we 
have shown how starting from the halo mass function and halo bias, we can obtain expressions for cluster statistics 
in terms of any mass tracer (e.g. X-ray temperature/luminosity, number of galaxies, etc.) in such a way that various 
experimental effects can be included in the formalism. Specifically, we include intrinsic dispersion of the richness- 
mass relation, bias and/or scatter due to experimental measurements, detection rates, and false detections. We also 
identified various sources of errors and derived the uncertainties one expects due to intrinsic scatter in the richness- 
mass relation. We believe that these derivations are important in that they allow us to compare theory directly 
to observations, without having to manipulate the observations in attempts to retrieve, for instance, the halo mass 
function. Further, they allow us to consider the possibility of using cluster statistics to constrain not just cosmology, 
but also how the mass tracer scales with halo mass. 

Having derived our formalism, we applied it to the case of large local optical cluster surveys. In this context, we
have shown that optical cluster survey determinations of cluster abundances and bias can provide strong constraints
on the amplitude of the power spectrum at cluster scales ($\sigma_8$), the power law index of the scaling relation for the
number of galaxies in a halo of mass $m$, and perhaps even the Hubble parameter $h$ (see table I). We argued as well
that cluster abundances and bias are not well suited for constraining $\Omega_m$ or $M_1$, the amplitude of the mass tracer
scaling relation with mass.

We have shown as well that one needs to be very careful when analyzing cluster abundance and bias data in
order to avoid overly optimistic constraints. In particular, we have shown that realistic constraints on $\sigma_8$ need to
be marginalized over $h$ and, when bias information is unavailable, marginalization over priors on $\alpha$ is also necessary.
Though marginalization over $M_1$ has a much smaller effect on $\sigma_8$, it becomes of paramount importance if one wishes
to constrain $\Omega_m$. In fact, we found that the uncertainties in $\sigma_8$ and $\Omega_m$ obtained from cluster abundance studies alone
are entirely determined by the priors used on $\alpha$ and $M_1$.

We have attempted to explain why the uncertainties in $\sigma_8$ and $\Omega_m$ are driven by the priors on $\alpha$ and
$M_1$ when only cluster abundance information is used. In particular, we have argued that this effect is driven by two
degeneracies, one involving $M_1$ and $\Omega_m$, the other involving $\alpha$ and $\sigma_8$. We have shown here that these degeneracies
arise from scaling laws satisfied by the halo mass function. The scaling law leading to the $M_1 - \Omega_m$ degeneracy was
found empirically by Zheng et al. [57], but it is reassuring to see it re-emerge here in our Fisher analysis. The scaling
law (57) leading to the $\alpha - \sigma_8$ degeneracy is, to the best of our knowledge, a new result.

Finally, we argued that because the degeneracies above stem from intrinsic scaling relations of the halo mass
function, our conclusions are valid for any cluster abundance study in which the mass tracer scales with mass as a
power law to a good approximation. In particular, for any such study, any constraints that one wishes to place on
cosmology need to be properly marginalized over the Hubble rate $h$ as well as the amplitude and power law index of
the mass tracer scaling relation.

In summary, then, we have found that cluster abundances and cluster bias are powerful tools that can greatly
constrain both the amplitude of the power spectrum at cluster scales, and how the number of galaxies in a halo scales
with mass. However, neither $\Omega_m$ nor the characteristic mass scale for the formation of galaxies can be accurately
constrained. In either case, it is always important to marginalize over $h$ in order to avoid obtaining unrealistically
tight constraints.
APPENDIX A: WHERE DID THE USUAL $\sigma_8 - \Omega_m$ DEGENERACY GO?

Cosmological constraints from cluster abundance studies are usually expressed in the form of the so-called cluster
abundance normalization condition, $\sigma_8\Omega_m^{\gamma} \approx 0.5$ where $\gamma \sim 0.5$. In other words, cluster abundances
are usually used to constrain the combination $\sigma_8\Omega_m^{\gamma}$. This combination of parameters, however, did not arise
from our analysis. Where did the usual $\sigma_8 - \Omega_m$ degeneracy go?

1. The $\sigma_8 - \Omega_m$ Degeneracy

Let us consider first equation (48). We stated in section VII C that (48) could be thought of as a generalized cluster
abundance normalization condition. While the exponent of $\Omega_m$ is a little steeper than usual, we show below that this
arises simply because we are probing rather low mass scales.

We begin our analysis by determining the 95% confidence regions in the $\sigma_8 - \Omega_m$ plane obtained using only our 10
richest bins (clusters having 35 galaxies or M > 2 • lO^^M©) while holding all parameters except $\sigma_8$ and $\Omega_m$ constant.
This is shown in figure 10 with a dotted line. The degeneracy axis in the figure is $\sigma_8\Omega_m^{0.62}$ = constant, so we see
that we do indeed recover the cluster normalization condition when (a) we hold all other parameters fixed, and (b) we
restrict ourselves to the most massive clusters. This degeneracy simply reflects the fact that the halo mass function
at m ~ 1O^*M0 scales is degenerate according to the above expression (see e.g. Zheng et al. [57]). On this basis,
one would expect that probing low mass scales would allow us to break this degeneracy, which is indeed the case. We
show this in figure 10, where we plot the confidence regions obtained when including lower mass clusters. In particular,
the confidence regions shown with the dashed and solid lines in figure 10 are obtained using all but the lowest six
bins ($N_{gal} > 21$ or M > 10^"* Mq) and all bins respectively. We see that with the inclusion of lower mass bins the
degeneracy region is greatly reduced. Further, as we probe lower and lower masses, the axis which is least strongly
constrained becomes steeper and steeper, with the exponent going from 0.62 when only the 10 richest bins are used to 0.73 when all bins
are used.

We can gain further insight into the $\sigma_8 - \Omega_m$ degeneracy by considering what happens when we allow $M_1$ and $\alpha$ to
vary, while keeping $h$ fixed. The two most constrained directions obtained from the estimated parameter correlation
matrix are

(A1)

(A2)

We are again neglecting a small power-law dependence on $\sigma_8$ in equation (A2).

Note that these two eigenvectors are not simply expressions (48) and (49) with $h$ held constant. To see how they are
related, it is best to pretend we know nothing about equations (48) and (49) while we analyze the above eigenvectors.
We can then go back and see how the pair of eigenvectors (48) and (49) are related to (A1) and (A2).

Let us then analyze the above eigenvectors, whose structure allows for a very simple interpretation. Take $\alpha$ and
$M_1$ to be fixed. Then, the first eigenvector becomes the cluster normalization condition $\sigma_8\Omega_m^{0.62}$, while the second
eigenvector is almost entirely $\Omega_m$. Since the low mass end of the halo mass function is essentially independent of $\Omega_m$,
this suggests the first eigenvector is driven by the high mass end of the halo mass function only, while the second
eigenvector is driven by the low mass end alone. Indeed, the eigenvectors obtained when restricting ourselves to high






FIG. 10: The figure above shows two things. First, when we restrict our analysis to $\sigma_8$ and $\Omega_m$, holding all other parameters fixed
and considering only massive clusters ($N_{gal} > 35$), we recover the usual $\sigma_8 - \Omega_m$ degeneracy (dotted line, 95% confidence). Also shown
are the 95% confidence regions obtained using all but the lowest six bins (dashed) and all bins (solid). This illustrates the fact that
probing low halo masses breaks the $\sigma_8 - \Omega_m$ degeneracy by a considerable amount. Therefore, we do not expect to recover the usual cluster
normalization condition, since we are probing rather low halo masses.

mass bins (clusters with 35 galaxies or more) are close to those above, except that we do not lose any constraining
power for the first eigenvector, while the constraint on the second eigenvector is weakened by a factor of four. We
conclude then that equation (A1) may be thought of as the constraint arising from matching the high mass end of the
halo mass function, while equation (A2) arises from matching the low mass end.

Are the vectors (A1) and (A2) related to the vectors we found in section VII C? To approach this question, we can
consider what happens to the vectors (A1) and (A2) when we let $h$ vary.

Consider first what happens at a qualitative level: $h$ affects both the high mass end and the low mass end of the
halo mass function. As such, we do not expect our eigenvectors to cleanly separate into matching one or the other
end of the halo mass function; they will mix. Indeed, when we let $h$ vary, the first eigenvector becomes



where the exponents depend on the lowest mass scale probed, e.g. $\gamma = 0.73$ when all bins are used (as we found in section
VII C) while $\gamma = 0.62$ when we use only rich clusters.

We note several things: first, the vector (A3) is essentially identical to (48) when $n$ is held fixed. Interestingly, though,
the $\sigma_8\Omega_m^{\gamma}$ degeneracy no longer has the constant exponent $\gamma \approx 0.62$. The variation of the exponent comes about
because of the mixing of the constraints from the low and high mass ends of the halo mass function, and hence our
result above compromises by giving us the most strongly constrained direction in the $\sigma_8 - \Omega_m$ plane when all other
parameters are held fixed.

What about the eigenvector from expression (A2)? Since $h$ mixes the high and low mass end constraints, one may
expect the eigenvector (A2) not only to be drastically altered (since it no longer represents the low mass end constraint),
but also to be greatly weakened, since the constraining power of the high and low mass ends of the halo mass function
now goes into the first eigenvector. This is indeed the case. Despite this result, however, there is still a second highly
constrained eigenvector, which is essentially that of expression (49) with $n$ fixed. This eigenvector is a new constraint
that arises from allowing $h$ to vary, and is thus not associated with the eigenvector from expression (A2).



We argued throughout the text that the appearance of the $\alpha - \sigma_8$ and $M_1 - \Omega_m$ degeneracies makes marginalization
over the mass tracer scaling relation necessary if one wishes to avoid placing overly optimistic constraints on the
cosmological parameters. A by-product of this marginalization is that the characteristic shape of the $\sigma_8 - \Omega_m$
degeneracy is then completely washed out. We illustrate this effect below, not only to observe the degradation of the
confidence regions, but also because we can get a better feel for how important the various effects are.

FIG. 11: The effect of allowing various parameters to vary in determining the $\sigma_8 - \Omega_m$ confidence regions is shown above. All curves
above are 95% confidence and hold $n$ and $\Omega_b h^2$ fixed. The inner solid curve also holds $\alpha$, $M_1$, and $h$ fixed. The dashed curve allows $\alpha$ and
$M_1$ to vary with priors $\Delta\alpha/\alpha = 10\%$ and $\Delta M_1/M_1 = 30\%$. The dotted curve holds $\alpha$ and $M_1$ fixed, but varies $h$ with the prior $\Delta h = 0.1$.
Finally, the outer solid curve allows $\alpha$, $M_1$, and $h$ to vary with the aforementioned gaussian priors.

We begin with the constraints on $\sigma_8$ and $\Omega_m$ when all other parameters are fixed. This is shown in figure 11 as
the inner solid ellipse (95% confidence region), which matches that of figure 10. We now let $\alpha$ and $M_1$ vary by
assuming gaussian priors $\Delta\alpha/\alpha = 10\%$ and $\Delta M_1/M_1 = 30\%$. This is shown as a dashed curve in figure 11. The effect
of allowing $\alpha$ and $M_1$ to vary is staggering: the confidence regions are enormously expanded.
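The difference between holding a parameter fixed and marginalizing over it can be made concrete with a small numerical sketch. The example below is not the computation behind the figures: the three-parameter Fisher matrix for ($\sigma_8$, $\Omega_m$, $\alpha$) is a hypothetical set of numbers, and the function name is our own. Fixing $\alpha$ corresponds to inverting the 2x2 sub-block of the Fisher matrix, whereas marginalizing corresponds to taking the 2x2 sub-block of the inverse. For two gaussian parameters the contour at confidence level CL lies at $\Delta\chi^2 = -2\ln(1 - \mathrm{CL})$.

```python
import numpy as np

def confidence_ellipse(cov, level=0.95):
    """Semi-axes (major, minor) and major-axis angle in degrees for a 2x2 covariance."""
    delta_chi2 = -2.0 * np.log(1.0 - level)        # exact for two gaussian parameters
    evals, evecs = np.linalg.eigh(np.asarray(cov, dtype=float))
    semi_axes = np.sqrt(delta_chi2 * evals)[::-1]  # eigh sorts eigenvalues ascending
    major = evecs[:, 1]
    angle = np.degrees(np.arctan2(major[1], major[0])) % 180.0
    return semi_axes, float(angle)

# Hypothetical Fisher matrix for (sigma8, Omega_m, alpha); illustrative numbers only.
fisher = np.array([[400.0, 120.0, 150.0],
                   [120.0, 100.0,  30.0],
                   [150.0,  30.0,  80.0]])

cov_fixed = np.linalg.inv(fisher[:2, :2])          # alpha held fixed
cov_marginalized = np.linalg.inv(fisher)[:2, :2]   # alpha marginalized over

for label, cov in (("fixed", cov_fixed), ("marginalized", cov_marginalized)):
    axes, angle = confidence_ellipse(cov)
    print(f"alpha {label}: area = {np.pi * axes[0] * axes[1]:.3f}, angle = {angle:.1f} deg")
```

The marginalized ellipse is never smaller than the fixed-parameter one; how much larger it is depends on how strongly the marginalized parameter is correlated with the two that are plotted.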

We can likewise observe the effect of allowing $h$ to vary while keeping the other parameters fixed. Figure 11 shows the
95% confidence regions (dotted line) marginalized over $h$ with a gaussian prior $\Delta h = 0.1$. As we expect, one direction
remains tightly constrained, corresponding to the eigenvector (49), the generalized cluster abundance normalization
condition. On the other hand, the perpendicular direction is weakened to a large degree, again reflecting that a
floating $h$ mixes the high and low mass end constraints from the halo mass function into a single constraint.

Finally, shown with the outer solid curve in figure 11 is the 95% confidence region when $\alpha$, $M_1$, and $h$ are allowed
to vary using the above priors. This last curve is the true constraint one may expect from cluster abundances alone.
Also shown for reference with the thicker solid curve is the 95% confidence region obtained including bias information.
All priors for this last curve are the same, except for $\alpha$, for which no prior was assumed.



